Abstract: Recent mobile equipment (as well as the IEEE 802.21 standard)
now offers users the possibility of switching from one technology to another
(vertical handover). This allows flexibility in resource assignments and,
consequently, increases the potential throughput allocated to each user.

In this paper, we design a fully distributed algorithm based on trial and 
error mechanisms that exploits the benefits of vertical handover to find fair 
and efficient assignment schemes. On the one hand, mobiles gradually update 
the fraction of data packets they send to each network based on a value 
called repercussion utility they receive from the stations. On the other hand, 
network stations compute and send repercussion utilities to each mobile that 
represent the impact each mobile has on the cell throughput. 

This repercussion utility function is closely related to the concept of 
marginal cost in the pricing literature. Both the station and the mobile 
algorithms are simple enough to be implemented in current standard equip- 
ment. 

Based on tools from evolutionary games, potential games, replicator dy- 
namics and stochastic approximations, we analytically show the convergence 
of the algorithm to solutions that are efficient and fair in terms of through- 
put. Moreover, we show that after convergence, each user is connected to a 
single network cell which avoids costly repeated vertical handovers. 

* A study of allocation games has been included. 
Several simple heuristics based on this algorithm are proposed to achieve
fast convergence. Indeed, for implementation purposes, the number of
iterations should remain on the order of a few tens. We finally provide
extensive simulations of the algorithm in several scenarios.

Key-words: Distributed Algorithms, Hybrid Wireless Networks, Evolu- 
tionary Games, Potential Games, Replicator Dynamics, Vertical Handover, 
Fairness, Stochastic Approximation. 
A Distributed Algorithm for Fair and Efficient User-Network Association in
Multi-Technology Wireless Networks

1 Introduction

The overall wireless market is expected to be served by six or more major 
technologies (GSM, UMTS, HSDPA, WiFi, WiMAX, LTE). Each technology 
has its own advantages and disadvantages and none of them is expected to 
eliminate the rest. Moreover, radio access equipment is becoming more and
more multi-standard, offering the possibility of connecting through two or
more technologies concurrently, using the IEEE 802.21 standard. Switching
between networks using different technologies is referred to as vertical handover.
This is currently done in UMA, for instance, which gives an absolute priority 
to WiFi over UMTS whenever a WiFi connection is available. In this paper, 
in contrast, we address the problem of computing an efficient association by 
providing a distributed algorithm that can be fair to all users or efficient in 
terms of overall throughput. Here are the theoretical contributions of the 
paper. 

- First, we propose a distributed algorithm with guaranteed convergence to a
non-cooperative equilibrium. This algorithm is based on an iterative
mechanism: at each time epoch the mobile nodes adapt the proportion of the
traffic they send on each network, based on some values (called repercussion
utilities in the following) they receive from the network. This work is in line
with some recent work on learning of Nash equilibria (see, for instance, [1], [2]).

- Second, based on tools from potential games, we show that, by appro- 
priately setting up the repercussion utilities, the resulting equilibria can be 
made efficient or fair. 

- Last, we show that the obtained equilibrium is always pure: after conver- 
gence, each user is associated to a single technology. 

To validate our results, we propose several practical implementations 
of the algorithm and assess their performance in the practical setting of a 
geographical area covered by a global WiMAX network overlapping with 
several local IEEE 802.11 (also called WiFi) cells. We suppose that each 
user can multi-home, that is to say split her traffic between her local WiFi 
network and the global WiMAX cell, in order to maximize her repercussion 
utility (to be defined later). 

The integration of WiFi and UMTS or WiFi and WiMAX technologies 
has already received some attention in the past. 

There is a family of papers looking for solutions using Markov or
Semi-Markov Decision Processes [3], [4]. Based on Markovian assumptions on
the incoming traffic, these works provide numerical solutions that
optimize some average or discounted reward over time. Yet, because
of the complexity of the system at hand (the equations of the throughput
in actual wireless systems are not linear, and not even convex), important
simplifying assumptions need to be made, and the size of the state space
quickly becomes prohibitive for studying real systems. Moreover, these methods
require precise knowledge of the characteristics of the system (e.g. in terms
of bandwidth achieved in all configurations, interference impact of one cell
on the neighboring ones, rate of arrivals), data that are hardly available in
practice.

Our approach is rather orthogonal as we seek algorithms that converge 
towards an efficient allocation, using real-time measurements rather than 
off-line data. Such an approach follows game theory frameworks. There has
been recent work that, based on evolutionary games [5], provides optimal
equilibria. Evolutionary games [6], [7], or the closely-related population
games, are based on Darwinian-like dynamics. The evolutionary game
literature is now mature and includes several so-called population dynamics,
which model the evolution of the number of individuals of each population as
time goes by. In our context, a population can be seen as a set of individuals
adopting the same strategy (that is to say choosing the same network cell
in the system and adopting identical network parameters). Recent work [5]
has shown that, considering the so-called replicator dynamics, an appropriate
choice of the fitness function (that determines how well a population is
adapted to its environment) leads to efficient equilibria. However, it does
not provide algorithms that follow the replicator dynamics (and hence
converge to the equilibria). Additionally, it does not justify the use of
evolutionary games. Indeed, such games assume a large number of individuals,
each of them having a negligible impact on the environment and the fitness 
of others. This assumption is not satisfied here, where the number of active 
users in a given cell is on the order of a few tens. The arrival or departure 
of a single one of them hence significantly impacts the throughput allocated 
to others. As the number of players is limited, we are hence dealing with 
another kind of equilibria, namely the Nash Equilibria. 

The third trend of this article concerns Nash equilibria learning mech- 
anisms. In the context of load balancing, a few algorithms (see, for in- 
stance [HE]) have been shown to converge to Nash Equilibria. Interestingly 
enough, it has been pointed out that this class of algorithms has similar 
behavior and convergence properties as replicator dynamics in evolutionary 
game theory. It is to be noted that the main weakness of these algorithms is 
that they may converge to mixed strategy Nash equilibria, that is to say to 
equilibria where each user randomly picks up a decision at each time epoch. 
Such equilibria are unfortunately not interesting in our case, as they amount 
to perpetual handover between networks. 

Finally, there is a growing interest in measuring or analyzing the
efficiency of Nash Equilibria. The most famous concept is certainly the "price
of anarchy" [8]. Let us also mention the more recent SDF (Selfish
Degradation Factor) [9]. We will show in the following that the Nash Equilibria
our algorithm converges to are locally optimal with respect to these two
criteria. In addition, the algorithm has interesting fairness properties. Indeed,
we show how it can be tuned so as to converge to $\alpha$-fair points (defined
in cooperative game theory, see [10]), for an arbitrary value of the parameter
$\alpha$. This wide family of fairness criteria includes in particular the well-known
max-min fairness and proportional fairness and can be generalized so as to
cover the Nash Bargaining Solution point [11].

In the present paper, we hence propose to make use of the previous works
in evolutionary games on heterogeneous networks, with additional fairness
considerations, while proposing methods based on works on Nash learning
algorithms that can be implemented on future mobile equipment. In addition,
our work presents a novel result, which is that our algorithm converges
to pure (as opposed to mixed) equilibria, preventing undesired repeated
handovers between stations.

2 Framework and Model 

In this section, we present the model and the objective of this work while 
introducing the notations used throughout the paper. 

2.1 Interconnection of Heterogeneous wireless networks 

We consider a set $\mathcal{N}$ of mobiles, such that mobile $n$ can connect to a
set of network cells, which can be of various technologies (WiFi, WiMAX, UMTS,
LTE...). The set of cells that users¹ can connect to depends on their
geographical location, wireless equipment and operator subscription.

2.2 User throughput and cell load 

By throughput, we refer to the rate of useful information available for a user, 
in a given network, sometimes also called goodput in the literature. 

The throughput obtained by an individual on a network depends on both
her own parameters and the ones of others. These parameters include
geographical position (interference and attenuation level) as well as wireless
card settings (coding schemes, TCP version, to cite a few). In previous
papers [3111], the authors discretize the cells of networks into zones of identical
throughput (see Fig. 1). This means that users in the same zone will receive
the same throughput. Here, we can consider that each user is in its own
zone². The set of users connected to a network is called the load of the
network.

More formally, we suppose that each user $n$ has a set of network cells she
can connect to, denoted by $\mathcal{I}_n$. An action $s_n$ for user $n$ is the choice
of a cell $i \in \mathcal{I}_n$. Then, we denote by $s$ the vector of users' actions,
$s = (s_n)_{n \in \mathcal{N}}$, and call it an allocation of mobiles to networks.

¹ In the following we use the terms users and mobiles interchangeably.
² Unlike in the cited papers, we are not constrained by the size of the system,
which increases with the number of zones.
Figure 1: A heterogeneous wireless system consisting of a single MAN
(Metropolitan Area Network, e.g. WiMAX) cell and a set of partly overlapping
LAN (Local Area Network, e.g. WiFi) hot-spots (in grey). As user
B (in zone 1) is closer to the WiMAX antenna, it can use a more efficient
coding scheme than A (in zone 2) (for instance QAM 16 instead of QPSK).
Zones are represented with dashed lines, as opposed to cells, with solid lines.

Then, for each allocation $s$, the load of network $i$ is denoted by
$\ell^i(s) \in \{0;1\}^N$, and is such that $\ell^i_n(s) = 1$ if user $n$ takes
action $i$, and $0$ otherwise. The throughput $u_n(\ell^i(s))$ of user $n$ taking
action $i$ is a function depending only on the load vector of cell $i$. With
these notations, the throughput received by $n$ when she takes decision $s_n$
is $u_n(\ell^{s_n}(s))$.

2.3 Pure versus mixed strategies 

As opposed to multi-homing between WiFi systems (see [lH [5]),
multi-homing between different technologies (e.g. WiFi and WiMAX) induces
several complications: the different technologies may have different delays,
packet sizes or coding systems, and reconstructing the
messages sent by the mobiles may be hazardous. Hence, while each user
can freely switch between the network cells she has access to, we aim at
algorithms that converge, after a transitional state, to equilibria in which
each user uses a single network (so as to avoid the cumbersome handover
procedure). These are called pure strategy equilibria (see Section 3.4).

Yet, during the convergence phase, each mobile is using mixed
strategies³. Then, the experienced throughput needs to be considered in terms of
expectations. In this case, $q_n$ is a vector of probabilities where $q_{n,i}$ is the
probability that mobile $n$ chooses cell $i \in \mathcal{I}_n$. The global strategy set
is the matrix $q = (q_n)_{n \in \mathcal{N}}$, while the choice $S_n(q)$ is now a random
variable such that $\mathbb{P}(S_n(q) = i) = q_{n,i}$. It follows that the expected
throughput received by user $n$ is $\mathbb{E}[u_n(\ell^{S_n(q)}(S(q)))]$, where
$S(q) = (S_n(q))_{n \in \mathcal{N}}$.

³ The formal definition is given in Section 3.4.

2.4 Efficiency and Fairness 

In our approach, we consider elastic or data traffic. Then, the
Quality-of-Service (QoS) experienced by each mobile user is its experienced throughput.
We are hence interested in seeking equilibria that are optimal (in the sense
of Pareto) in terms of throughput. Such an equilibrium is a strategy $q$ such that
one cannot find another strategy $q'$ that increases the expected throughput
of a user without decreasing that of another one: $\forall q' \neq q$,

$$\exists n \in \mathcal{N} \text{ s.t. } \mathbb{E}[u_n(\ell^{S_n(q')}(S(q')))] > \mathbb{E}[u_n(\ell^{S_n(q)}(S(q)))] \;\Rightarrow\; \exists m \in \mathcal{N},\ \mathbb{E}[u_m(\ell^{S_m(q')}(S(q')))] < \mathbb{E}[u_m(\ell^{S_m(q)}(S(q)))].$$

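We design a fully distributed algorithm that converges to points which
are not only Pareto optimal but also $\alpha$-fair. The class of $\alpha$-fair
points [10] achieves

$$\max_{q} \sum_{n \in \mathcal{N}} \mathbb{E}\left[G_\alpha\left(u_n(\ell^{S_n(q)}(S(q)))\right)\right] \quad \text{with } G_\alpha(x) \stackrel{\text{def}}{=} \frac{x^{1-\alpha}}{1-\alpha}. \tag{1}$$

In the case of pure strategies, for each mobile $n$ such that $s_n = i$,
$\mathbb{E}[u_n(\ell^i(S))] = u_n(\ell^i(s))$. So, we aim at building an algorithm
that converges to an allocation $s^*$ that reaches

$$\max_{s} \sum_{n \in \mathcal{N}} G_\alpha\left(u_n(\ell^{s_n}(s))\right).$$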

When $\alpha = 0$, the corresponding solution is a social optimum. When $\alpha$
tends to one, the solution is a proportional fair point (or Nash Bargaining
Solution) and when $\alpha$ tends to infinity, it converges to a max-min fair point.
The parameter $\alpha$ hence allows flexibility in choosing between fully efficient
versus fair allocation, while ensuring Pareto optimality.
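The role of $\alpha$ can be illustrated with a minimal sketch. It assumes the usual $\alpha$-fair utility $G_\alpha(x) = x^{1-\alpha}/(1-\alpha)$ (with $\log x$ at $\alpha = 1$); the three candidate throughput vectors are hypothetical values chosen only for illustration, and a large finite $\alpha$ stands in for the max-min limit.

```python
import math

def alpha_fair_utility(x, alpha):
    """G_alpha(x) for a strictly positive throughput x (assumed standard form)."""
    if x <= 0:
        raise ValueError("throughput must be strictly positive")
    if math.isclose(alpha, 1.0):
        return math.log(x)  # proportional fairness case
    return x ** (1.0 - alpha) / (1.0 - alpha)

def alpha_fair_objective(throughputs, alpha):
    """Sum of G_alpha over the users of one allocation."""
    return sum(alpha_fair_utility(x, alpha) for x in throughputs)

def best_allocation(candidates, alpha):
    """Candidate throughput vector maximizing the alpha-fair objective."""
    return max(candidates, key=lambda t: alpha_fair_objective(t, alpha))

# Hypothetical throughput vectors (user 1, user 2) of three allocations.
candidates = [(10.0, 1.0), (5.0, 4.0), (4.4, 4.4)]

efficient = best_allocation(candidates, 0.0)      # largest total throughput
proportional = best_allocation(candidates, 1.0)   # largest product of throughputs
near_max_min = best_allocation(candidates, 20.0)  # favors the worst-off user
```

In this toy instance, increasing $\alpha$ moves the selected vector from the one with the largest sum to the one with the largest minimum.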

Finally, it is well-known that selfish behavior in the use of resources
(networks) may lead to inefficient use, in case of congestion for example. To
circumvent this, we introduce some repercussion utility functions that are
notified to users. Thus, instead of competing for throughput, we consider
an algorithm reflecting a non-cooperative game between users that compete
for maximizing their repercussion utility. We will give an explicit closed form
of the repercussion utility function in Section 3.2. As in the throughput case,
the repercussion utility on a cell only depends on the load on that cell. We
denote by $r_n(\ell^{s_n}(s))$ the repercussion utility received by user $n$ (as for the
throughput, the repercussion utility received by that user also depends on
the choices of the other mobiles of the system, as reflected in the allocation
vector $s$). In the case of mixed strategies, the expected repercussion utility
is $\mathbb{E}[r_n(\ell^{S_n}(S))]$. The study of such games is given in the next section.

3 Allocation Games Related to Potential Games 

This section is devoted to the formal study of allocation games. After
defining what an allocation game is in Section 3.1, we introduce the repercussion
utilities in Section 3.2, which leads to a new game that is characterized in
Section 3.3. Finally, we show the useful property that this game is a potential
game (Section 3.4).

3.1 Allocation Games 

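We consider a normal-form game $(\mathcal{N}, \mathcal{I}, U)$ consisting of a set
$\mathcal{N}$ of players ($|\mathcal{N}| = N$), player $n$ taking actions in a set
$\mathcal{I}_n \subset \mathcal{S}$ ($|\mathcal{I}_n| = I_n$), where $\mathcal{S}$ is
the set of all actions. Let us denote by $s_n \in \mathcal{I}_n$ the action taken by
player $n$, and $s = (s_n)_{n \in \mathcal{N}} \in \mathcal{I} = \bigotimes_{n=1}^{N} \mathcal{I}_n$.
Then, $U = (U_n)_{n \in \mathcal{N}}$ refers to the utility or payoff for each player:
the payoff for player $n$ is $U_n(s_1, \dots, s_n, \dots, s_N)$.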

By definition, an allocation game is a game such that the payoff of a
player when she takes action $i$ only depends on the set of players who also
take action $i$. One can interpret such a game as a set of users who share a
common set of resources $\mathcal{S}$, and an action vector corresponds to an
allocation of resources to users (hence the name of these games).

We define the load on action (or resource) $i$ by $\ell^i(s) \in \{0;1\}^N$ as a
vector such that $\ell^i_n(s) = 1$ if player $n$ takes action $i$, and $0$ otherwise.
When there is no ambiguity, we will simplify the notation and use $\ell = \ell^i(s)$.
We denote by $\ell^{s_n}(s)$ the load on the action taken by player $n$, and we denote
the payoff for player $n$ by
$u_n(\ell^{s_n}(s)) \stackrel{\text{def}}{=} U_n(s_1, \dots, s_n, \dots, s_N)$.

Hence, allocation games are a wider class of games than congestion
games, where the payoff of each player depends only on the number of players
adopting the same strategy [13]. They represent systems where different
users accessing a given resource may have a different impact.

3.2 Repercussion utilities 

We build a companion game of the allocation game, denoted
$(\mathcal{N}, \mathcal{I}, R)$. The new player utilities, called repercussion
utilities, are built from the payoffs of the original game, according to the
following definition:

Definition 1 (allocation game with repercussion utilities). Let us consider
the repercussion utility for player $n$ to be:

$$r_n(\ell^{s_n}(s)) \stackrel{\text{def}}{=} u_n(\ell^{s_n}(s)) - \sum_{m \neq n : s_m = s_n} \left[ u_m(\ell^{s_n}(s) - e_n) - u_m(\ell^{s_n}(s)) \right],$$

where $e_n$ denotes the vector whose entries are all $0$ but the $n$-th one, which
equals $1$.

An allocation game with repercussion utilities is a game whose payoffs 
are repercussion utilities. 

The utilities defined in this manner have a natural interpretation: each one
corresponds to the player's payoff ($u_n(\ell^{s_n}(s))$) minus the total increase in
payoff that the other users of the same commodity would enjoy if that player
were absent
($\sum_{m \neq n : s_m = s_n} [u_m(\ell^{s_n}(s) - e_n) - u_m(\ell^{s_n}(s))]$).
This is more obvious in the following equivalent formulation.

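Remark 1. An equivalent formulation of the repercussion utilities is:

$$r_n(\ell^{s_n}(s)) = \sum_{m : \ell^{s_n}_m(s) = 1} u_m(\ell^{s_n}(s)) - \sum_{m \neq n : \ell^{s_n}_m(s) = 1} u_m(\ell^{s_n}(s) - e_n).$$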
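The two formulations of the repercussion utility can be compared numerically. The sketch below is only an illustration: the payoff function `throughput` (a hypothetical peak rate shared equally among the users of a cell) and the helper names are assumptions made here, not part of the model above.

```python
from itertools import product

def load(allocation, cell):
    """Load vector of `cell`: 1 for each user on it, 0 otherwise."""
    return tuple(1 if s == cell else 0 for s in allocation)

def remove(load_vec, n):
    """Load vector minus e_n (user n leaves the cell)."""
    out = list(load_vec)
    out[n] = 0
    return tuple(out)

def repercussion(u, allocation, n):
    """Definition 1: own payoff minus what co-located users would gain if n left."""
    l = load(allocation, allocation[n])
    others = [m for m, present in enumerate(l) if present and m != n]
    gain_if_absent = sum(u(m, remove(l, n)) - u(m, l) for m in others)
    return u(n, l) - gain_if_absent

def repercussion_alt(u, allocation, n):
    """Remark 1: cell payoff with n minus the others' payoff without n."""
    l = load(allocation, allocation[n])
    users = [m for m, present in enumerate(l) if present]
    return sum(u(m, l) for m in users) - sum(u(m, remove(l, n)) for m in users if m != n)

# Hypothetical payoff: peak rate RATES[m] shared equally among users of the cell.
RATES = (6.0, 3.0, 12.0)

def throughput(m, load_vec):
    return RATES[m] / sum(load_vec)

# Largest disagreement between both formulations over all allocations to 2 cells.
max_mismatch = max(
    abs(repercussion(throughput, s, n) - repercussion_alt(throughput, s, n))
    for s in product(range(2), repeat=3)
    for n in range(3)
)
```

For the allocation `(0, 0, 1)`, user 0 shares cell 0 with user 1: her throughput is 3, user 1 would gain 1.5 if she left, so her repercussion utility is 1.5.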

3.3 Characterization of Allocation Games with Repercussion 
Utilities 

We now give a characterization of a payoff that is a repercussion utility. 

Proposition 1. An allocation game $(\mathcal{N}, \mathcal{I}, R)$ is an allocation game
with repercussion utilities if and only if $\forall \ell, \forall n, m \in \mathcal{N}$
s.t. $s_m = s_n$,

$$r_n(\ell) - r_n(\ell - e_m) = r_m(\ell) - r_m(\ell - e_n). \tag{2}$$

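Proof. Suppose that $r$ is a repercussion utility, then there exists a payoff $u$
such that:

$$r_n(\ell) = \sum_{k : \ell_k = 1} u_k(\ell) - \sum_{k \neq n : \ell_k = 1} u_k(\ell - e_n).$$

Then, denote

$$A \stackrel{\text{def}}{=} \sum_{k \neq n, m : \ell_k = 1} u_k(\ell - e_n - e_m),$$

which is symmetric in $n$ and $m$. Since
$r_n(\ell - e_m) = \sum_{k \neq m : \ell_k = 1} u_k(\ell - e_m) - A$, we get

$$\begin{aligned}
r_n(\ell) - r_n(\ell - e_m) &= \sum_{k : \ell_k = 1} u_k(\ell) - \sum_{k \neq n : \ell_k = 1} u_k(\ell - e_n) - \sum_{k \neq m : \ell_k = 1} u_k(\ell - e_m) + A \\
&= r_m(\ell) - r_m(\ell - e_n),
\end{aligned}$$

since the right-hand side of the first line is unchanged when $n$ and $m$ are swapped.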
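Conversely, consider an allocation game $(\mathcal{N}, \mathcal{I}, R)$ such that
Eq. (2) is satisfied. Consider an action $i$ and $\ell$ the vector of load on action
$i$. Let $K \stackrel{\text{def}}{=} \sum_{n \in \mathcal{N}} \ell_n$ be the number of
players taking action $i$. Further, let $(a(k))$, $1 \le k \le K$, be the subscripts of
all players taking action $i$, so that $\ell = \sum_{k=1}^{K} e_{a(k)}$. Then, we claim
that, for any permutation $\sigma$ of $\{1, \dots, K\}$:

$$\sum_{k=0}^{K-1} r_{a(k+1)}\Big(\ell - \sum_{j=1}^{k} e_{a(j)}\Big) = \sum_{k=0}^{K-1} r_{a(\sigma(k+1))}\Big(\ell - \sum_{j=1}^{k} e_{a(\sigma(j))}\Big). \tag{3}$$

Indeed, note that, from Eq. (2),

$$r_{a(k+1)}\Big(\ell - \sum_{j=1}^{k-1} e_{a(j)}\Big) - r_{a(k+1)}\Big(\ell - \sum_{j=1}^{k} e_{a(j)}\Big) = r_{a(k)}\Big(\ell - \sum_{j=1}^{k-1} e_{a(j)}\Big) - r_{a(k)}\Big(\ell - \sum_{j=1}^{k-1} e_{a(j)} - e_{a(k+1)}\Big).$$

Therefore:

$$r_{a(k)}\Big(\ell - \sum_{j=1}^{k-1} e_{a(j)}\Big) + r_{a(k+1)}\Big(\ell - \sum_{j=1}^{k} e_{a(j)}\Big) = r_{a(k+1)}\Big(\ell - \sum_{j=1}^{k-1} e_{a(j)}\Big) + r_{a(k)}\Big(\ell - \sum_{j=1}^{k-1} e_{a(j)} - e_{a(k+1)}\Big).$$

Hence, for any $k$, the sum $\sum_{k} r_{a(k+1)}(\ell - \sum_{j=1}^{k} e_{a(j)})$ remains
unchanged if one swaps $a(k)$ and $a(k+1)$ (elementary transposition). Then, Eq. (3)
results from the fact that any permutation $\sigma$ can be decomposed into a finite
number of elementary transpositions.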

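We now construct a payoff $u$ as follows: for any $n$ such that $\ell_n = 1$, let us
define:

$$u_n(\ell) = \frac{1}{K} \sum_{k=0}^{K-1} r_{a(k+1)}\Big(\ell - \sum_{j=1}^{k} e_{a(j)}\Big).$$

Then,

$$\sum_{m : \ell_m = 1} u_m(\ell) - \sum_{m \neq n : \ell_m = 1} u_m(\ell - e_n) = \sum_{k=0}^{K-1} r_{a(k+1)}\Big(\ell - \sum_{j=1}^{k} e_{a(j)}\Big) - \sum_{k=0}^{K-2} r_{b(k+1)}\Big(\ell - e_n - \sum_{j=1}^{k} e_{b(j)}\Big),$$

where $(b(k))$, $1 \le k \le K-1$, denotes the subscripts of the players taking action
$i$ under the load $\ell - e_n$.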

RR n° 6653 






P. Coucheney, C. Touati, B. Gaujal 



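Note that the sequence $a$ is identical to sequence $b$ with the additional
element $n$. From Eq. (3) we can choose a permutation $\sigma$ such that
$a(\sigma(1)) = n$. Then:

$$\begin{aligned}
\sum_{m : \ell_m = 1} u_m(\ell) - \sum_{m \neq n : \ell_m = 1} u_m(\ell - e_n)
&= \sum_{k=0}^{K-1} r_{a(\sigma(k+1))}\Big(\ell - \sum_{j=1}^{k} e_{a(\sigma(j))}\Big) - \sum_{k=1}^{K-1} r_{a(\sigma(k+1))}\Big(\ell - e_n - \sum_{j=2}^{k} e_{a(\sigma(j))}\Big) \\
&= \sum_{k=1}^{K-1} r_{a(\sigma(k+1))}\Big(\ell - \sum_{j=1}^{k} e_{a(\sigma(j))}\Big) + r_{a(\sigma(1))}(\ell) - \sum_{k=1}^{K-1} r_{a(\sigma(k+1))}\Big(\ell - e_n - \sum_{j=2}^{k} e_{a(\sigma(j))}\Big) \\
&= r_n(\ell),
\end{aligned}$$

where the last equality holds because $e_{a(\sigma(1))} = e_n$, so that the two sums
over $k \ge 1$ cancel.

Hence $(\mathcal{N}, \mathcal{I}, R)$ is the allocation game with repercussion utilities
associated to the $(\mathcal{N}, \mathcal{I}, U)$ allocation game. □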

From Prop. 1 we conclude that allocation games with repercussion utilities
are a special subset of allocation games. The results presented in the
following are hence valid for any allocation game such that Eq. (2) is satisfied.

Example 1. Let $M$ be the payoff matrix of a two-player game. This amounts
to saying that the first (resp. second) player chooses the row (resp. column),
and that the payoff for the first (resp. second) player is given by the first
(resp. second) component.

$$M = \begin{pmatrix} (a, A) & (b, B) \\ (c, C) & (d, D) \end{pmatrix}$$

It follows from Proposition 1 that this is a game with repercussion utilities
if and only if $a = A + b - C$ and $d = D + c - B$. Then, one can check the
interesting property that there necessarily exists a pure Nash equilibrium (for
instance $(a, A)$ is a Nash equilibrium if $a \ge c$ and $A \ge B$).
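The property stated in Example 1 can be checked by brute force. The following sketch draws random integer payoffs (an arbitrary choice made here so that comparisons are exact), forces $a$ and $d$ through the two conditions of the example, and tests each of the four cells for a pure Nash equilibrium. Matching pennies, which violates the conditions, is included for contrast.

```python
import random

def has_pure_nash(game):
    """game[row][col] = (payoff of the row player, payoff of the column player)."""
    for row in range(2):
        for col in range(2):
            p1, p2 = game[row][col]
            row_player_stays = p1 >= game[1 - row][col][0]
            col_player_stays = p2 >= game[row][1 - col][1]
            if row_player_stays and col_player_stays:
                return True
    return False

def random_repercussion_game(rng):
    """Draw b, c, A, B, C, D freely; a and d are then fixed by Example 1."""
    b, c, A, B, C, D = (rng.randint(-10, 10) for _ in range(6))
    a = A + b - C
    d = D + c - B
    return [[(a, A), (b, B)], [(c, C), (d, D)]]

rng = random.Random(0)
all_have_pure_nash = all(
    has_pure_nash(random_repercussion_game(rng)) for _ in range(10_000)
)

# Matching pennies does not satisfy a = A + b - C and has no pure equilibrium.
matching_pennies = [[(1, -1), (-1, 1)], [(-1, 1), (1, -1)]]
pennies_has_pure_nash = has_pure_nash(matching_pennies)
```

The check succeeds on every draw because the two conditions make the sum of unilateral payoff changes around the four cells equal to zero, which rules out a cycle of strict improvements.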



3.4 Allocation Games with Repercussion Utilities are Potential Games

In this section, we show that, given an allocation game, the game with
repercussion utilities (1) admits a potential function and (2) this potential equals
the sum of the payoffs for all players in the initial game. This appealing
property is exploited in the next section to show some strong results on the
behavior of the well-known replicator dynamics on such games.

Consider an allocation game $(\mathcal{N}, \mathcal{I}, U)$ and its companion game
$(\mathcal{N}, \mathcal{I}, R)$. We first assume that players have mixed strategies.
Hence a strategy for player $n$ is a vector of probabilities
$q_n = (q_{n,i})_{i \in \mathcal{I}_n}$, where $q_{n,i}$ is the probability for
player $n$ to take action $i$ (i.e. $q_{n,i} \ge 0$ and
$\sum_{i \in \mathcal{I}_n} q_{n,i} = 1$). The strategy domain for player $n$ is
$\Delta_n \stackrel{\text{def}}{=} \{0 \le q_{n,i} \le 1, \text{ s.t. } \sum_{i \in \mathcal{I}_n} q_{n,i} = 1\}$.
Then, the global domain⁴ is $\Delta = \bigotimes_{n=1}^{N} \Delta_n$ and a global
strategy is $q \stackrel{\text{def}}{=} (q_n)_{n \in \mathcal{N}}$. We
say that $q$ is a pure strategy if for any $n$ and $i$, $q_{n,i}$ equals either $0$ or $1$.

We denote by $S$ the random vector whose entries $S_n$ are all independent
and whose distribution is $\forall n \in \mathcal{N}, \forall i \in \mathcal{I}_n,
\mathbb{P}(S_n = i) = q_{n,i}$. The expected payoff for player $n$ when she takes
action $i$ is $f_{n,i}(q) \stackrel{\text{def}}{=} \mathbb{E}[r_n(\ell^i(S)) \mid S_n = i]$.
Then, her mean payoff is
$\bar{f}_n(q) \stackrel{\text{def}}{=} \sum_{i \in \mathcal{I}_n} q_{n,i} f_{n,i}(q)$.
We can notice that $f_{n,i}(q)$ only depends on $(q_{m,i})_{m \neq n}$ and that it is a
multi-linear function of $(q_{m,i})_{m \neq n}$.

The next theorem claims that the allocation game with repercussion
utilities is a potential game. Potential games were first introduced in [14]. The
notion was afterward extended to continuous sets of players [15]. In our case,
it refers to the fact that the expected payoff for each player derives from a
potential function. More precisely, we show that
$f_{n,i}(q) = \frac{\partial F}{\partial q_{n,i}}(q)$, where

$$F(q) \stackrel{\text{def}}{=} \sum_{n \in \mathcal{N}} \sum_{i \in \mathcal{I}_n} q_{n,i}\, \mathbb{E}[u_n(\ell^i(S)) \mid S_n = i]. \tag{4}$$

It is interesting to notice the connection between $f_{n,i}(q)$, which is the expected
repercussion utility, and $F(q)$, which refers to the sum of expected payoffs in
the initial game. A strategy that increases the expected repercussion utility
of a player yields a marginal increase of the potential.



Theorem 1. The allocation game with repercussion utilities is a potential
game, and its associated potential function is $F$, as defined in Eq. (4).



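Proof. Let us first differentiate function $F$:

$$\frac{\partial F}{\partial q_{n,i}}(q) = \mathbb{E}[u_n(\ell^i(S)) \mid S_n = i] + \sum_{m \neq n} q_{m,i} \frac{\partial\, \mathbb{E}[u_m(\ell^i(S)) \mid S_m = i]}{\partial q_{n,i}}.$$

In fact, it is clear that
$\frac{\partial\, \mathbb{E}[u_m(\ell^j(S)) \mid S_m = j]}{\partial q_{n,i}} = 0$ for
$j \neq i$, and that
$\frac{\partial\, \mathbb{E}[u_n(\ell^i(S)) \mid S_n = i]}{\partial q_{n,i}} = 0$.
To simplify the notations, we omit the index $i$. Then,

$$\begin{aligned}
\frac{\partial F}{\partial q_n}(q)
&= \mathbb{E}[u_n(\ell(S)) \mid S_n = i] + \sum_{m \neq n} q_m \sum_{\ell} u_m(\ell) \frac{\partial\, \mathbb{P}(\ell(S) = \ell \mid S_m = i)}{\partial q_n} \\
&= \mathbb{E}[u_n(\ell(S)) \mid S_n = i] + \sum_{m \neq n} q_m \sum_{\ell} u_m(\ell) \Big( \mathbb{P}(\ell(S) = \ell \mid S_m = i, S_n = i) - \mathbb{P}(\ell(S) = \ell \mid S_m = i, S_n \neq i) \Big) \\
&= \mathbb{E}[u_n(\ell(S)) \mid S_n = i] + \sum_{m \neq n} \sum_{\ell} u_m(\ell) \Big( \mathbb{P}(\ell(S) = \ell, S_m = i \mid S_n = i) - \mathbb{P}(\ell(S) = \ell + e_n, S_m = i \mid S_n = i) \Big) \\
&= \mathbb{E}[u_n(\ell(S)) \mid S_n = i] - \mathbb{E}\Big[ \sum_{m \neq n : S_m = S_n} \big( u_m(\ell(S) - e_n) - u_m(\ell(S)) \big) \,\Big|\, S_n = i \Big] \\
&= \mathbb{E}[r_n(\ell(S)) \mid S_n = i] \\
&= f_{n,i}(q).
\end{aligned}$$

□

⁴ Notice that $\Delta$ is a polyhedron.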
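The identity of Theorem 1 can be checked numerically on a small instance. The sketch below is an illustration under assumptions made here: three users, two cells, and a hypothetical payoff in which a peak rate is shared equally within a cell. Each other user $m$ is on cell $i$ with probability $q_{m,i}$ and off it with probability $1 - q_{m,i}$, as in the proof; the partial derivative of $F$ is taken by central finite differences.

```python
from itertools import product

RATES = (6.0, 3.0, 12.0)  # hypothetical peak rates
N, CELLS = len(RATES), 2

def u(m, load):
    """Payoff of user m on a cell with load vector `load` (equal sharing)."""
    return RATES[m] / sum(load)

def r(n, load):
    """Repercussion utility of user n (formulation of Remark 1)."""
    without_n = tuple(0 if k == n else x for k, x in enumerate(load))
    users = [m for m in range(N) if load[m]]
    return sum(u(m, load) for m in users) - sum(
        u(m, without_n) for m in users if m != n
    )

def expect_given_on_cell(g, n, i, q):
    """E[g(n, load of cell i) | S_n = i], other users being independent."""
    others = [m for m in range(N) if m != n]
    total = 0.0
    for bits in product((0, 1), repeat=len(others)):
        load, prob = [0] * N, 1.0
        load[n] = 1
        for m, b in zip(others, bits):
            load[m] = b
            prob *= q[m][i] if b else 1.0 - q[m][i]
        total += prob * g(n, tuple(load))
    return total

def potential(q):
    """F(q): sum of expected payoffs in the initial game, as in Eq. (4)."""
    return sum(
        q[n][i] * expect_given_on_cell(u, n, i, q)
        for n in range(N) for i in range(CELLS)
    )

def partial_derivative(q, n, i, h=1e-6):
    """Central finite difference of F with respect to q[n][i]."""
    up = [list(row) for row in q]
    down = [list(row) for row in q]
    up[n][i] += h
    down[n][i] -= h
    return (potential(up) - potential(down)) / (2 * h)

q = [[0.2, 0.8], [0.5, 0.5], [0.7, 0.3]]
max_gap = max(
    abs(partial_derivative(q, n, i) - expect_given_on_cell(r, n, i, q))
    for n in range(N) for i in range(CELLS)
)
```

Since $F$ is multi-linear in the $q_{n,i}$, the finite difference is exact up to rounding, and `max_gap` should be negligible.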

Remark 2. By adding a large constant to all payoff u, the repercussion 
utilities become positive. Clearly, this has no impact on the relative potential 
of allocations. The Nash Equilibria of the allocation game are also conserved. 
In the following, we will assume that the repercussion utilities are positive. 



4 Replicator dynamics and algorithms 

In this section, we show how to design a strategy update mechanism for all
players in an allocation game with repercussion utilities that converges to
pure Nash Equilibria. We will study their efficiency properties later on
(Section 4.4).



4.1 Replicator Dynamics. 

We now consider that the player strategies vary over time, hence $q$ depends
on the time $t$: $q = q(t)$. The trajectories of the strategies are described
below by a dynamics called replicator dynamics. We will see in Section 4.2
that this dynamics can be seen as the limit of a learning mechanism.

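Definition 2. The replicator dynamics fS^fJ^ is
($\forall n \in \mathcal{N}, i \in \mathcal{I}_n$):

$$\frac{dq_{n,i}}{dt}(q) = q_{n,i} \left( f_{n,i}(q) - \bar{f}_n(q) \right). \tag{5}$$

We say that $q$ is a stationary point (or equilibrium point) if
($\forall n \in \mathcal{N}, i \in \mathcal{I}_n$): $\frac{dq_{n,i}}{dt}(q) = 0$.

In particular, if $q$ is a stationary point, then
$\forall n \in \mathcal{N}, i \in \mathcal{I}_n$, $q_{n,i} = 0$ or
$f_{n,i}(q) = \bar{f}_n(q)$.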

Intuitively, this dynamics can be understood as an update mechanism 
where the probability for each player to choose actions whose expected pay- 
offs are above average will increase in time, while non profitable actions will 
gradually be abandoned. 

Let us notice that the trajectories of the replicator dynamics remain
inside the domain $\Delta$. Also, from [15], the potential function $F$ is a strict
Lyapunov function for the replicator dynamics, which means that the potential
is strictly increasing along the trajectories outside the stationary points.

In this context, a closed set $A$ is Lyapunov stable if, for every
neighborhood $B$, there exists a neighborhood $B' \subset B$ such that the trajectories
remain in $B$ for any initial condition in $B'$. $A$ is asymptotically stable if it
is Lyapunov stable and is an attractor (i.e. there exists a neighborhood $C$
such that all trajectories starting in $C$ converge to $A$). The existence of a
strict Lyapunov function yields the following:

Remark 3. The accumulation points of the trajectories of the replicator 
dynamics are stationary points. 

Intuitively, the limit points (that are connected) of the same trajectory
must have the same value for the Lyapunov function. But the set of limit
points is invariant for the dynamics, hence the Lyapunov function is
non-increasing on this set, which is only possible at stationary points since it is
strictly increasing elsewhere. The remark follows.

Proposition 2. All the asymptotically stable sets of the replicator dynamics 
are faces of the domain. These faces are sets of equilibrium points for the 
replicator dynamics. 

Proof. We show that any set which is not a face of the domain is not an
attractor. This results from a property discovered by E. Akin [16] which
states that the replicator dynamics preserves a certain form of volume.

Let $A$ be an asymptotically stable set of the replicator dynamics. Since
the domain $\Delta$ is polyhedral, $A$ is included in a face $F_A$ of $\Delta$. The
support of the face, $S(F_A)$, is the set of subscripts $(n, i)$ such that there
exists $q \in A$ with $q_{n,i} \neq 0$ or $1$. The relative interior of the face is
$\mathrm{Int}(F_A) = \{q \in F_A \text{ s.t. } \forall (n,i) \in S(F_A),\ 0 < q_{n,i} < 1\}$.

Furthermore, it should be clear that faces are invariant under the
replicator dynamics. Hence on the face $F_A$, by using a logarithmic change of
variables of the form $v_{n,i} \stackrel{\text{def}}{=} \log(q_{n,i} / q_{n,j_n})$
(with $j_n$ a fixed reference action of player $n$ in the support),
$\forall q \in \mathrm{Int}(F_A)$, one can see that

$$\frac{\partial}{\partial v_{n,i}} \frac{d v_{n,i}}{dt} = 0, \quad \forall n \in \mathcal{N}, i \in \mathcal{I}_n.$$

Up to this transformation, the divergence of the vector field is null on
$F_A$. Using Liouville's theorem [16], we infer that the transformed dynamics
preserves volume in $\mathrm{Int}(F_A)$. This implies that the set of limit points of the
trajectories in $\mathrm{Int}(F_A)$ is $\mathrm{Int}(F_A)$ itself. By the previous remark,
$\mathrm{Int}(F_A)$ is made of equilibrium points. By continuity of the vector field, all
the points in face $F_A$ are equilibria. Finally, since $A$ is asymptotically stable,
this means that $A = F_A$. □

We say that $s = (s_n)_{n \in \mathcal{N}}$ is a pure Nash Equilibrium if
$\forall n \in \mathcal{N}, \forall s'_n \neq s_n$,
$U_n(s_1, \dots, s_n, \dots, s_N) \ge U_n(s_1, \dots, s'_n, \dots, s_N)$.

Remark 4. Let $q$ be a pure strategy. We denote by $i_n$ the choice of player
$n$ such that $q_{n,i_n} = 1$. Then, a pure strategy $q$ being a Nash equilibrium is
equivalent to:

$$\forall n \in \mathcal{N}, \forall j \neq i_n, \quad f_{n,i_n}(q) \ge f_{n,j}(q).$$

The following proposition comes from a classical result that says that
the pure Nash equilibria are asymptotically stable points of the replicator
dynamics.

Proposition 3. If an asymptotically stable face of the replicator dynamics is
reduced to a single point, then this point is a pure Nash equilibrium of the
allocation game with repercussion utilities.

Proof. Let $q$ be an asymptotically stable point. Then $q$ is a face of $\Delta$ by
Proposition 2 (i.e. a 0-1 point), with, say, $q_{n,i} = 1$. Assume that $q$ is not a
Nash equilibrium. Then, there exists $j \neq i$ such that $f_{n,j}(q) \ge f_{n,i}(q)$.
Now, consider a point $q' = q + \epsilon e_{n,j} - \epsilon e_{n,i}$. Notice that
$f_{n,i}(q') = f_{n,i}(q)$ since $q'$ and $q$ only differ on components concerning
user $n$. Then starting in $q'$, the replicator dynamics is

$$\begin{aligned}
\frac{dq_{n,i}}{dt}(q') &= (1 - \epsilon)\Big(f_{n,i}(q') - \big((1 - \epsilon) f_{n,i}(q') + \epsilon f_{n,j}(q')\big)\Big) \\
&= -\epsilon (1 - \epsilon)\big(f_{n,j}(q') - f_{n,i}(q')\big),
\end{aligned}$$

and $\frac{dq_{n,j}}{dt}(q') = -\frac{dq_{n,i}}{dt}(q') \ge 0$.

For all users $m \neq n$, $\forall u \in \mathcal{I}_m$, $q'_{m,u} \in \{0, 1\}$, then
$\frac{dq_{m,u}}{dt}(q') = 0$.

Therefore starting from $q'$, the dynamics keeps moving in the direction
$e_{n,j} - e_{n,i}$ (or stays still) and does not converge to $q$. This contradicts
the fact that $q$ is asymptotically stable. □

Proposition 4. Allocation games with repercussion utilities admit at least
one pure Nash equilibrium.

Proof. Allocation games with repercussion utilities admit a potential that
is a Lyapunov function of their replicator dynamics. Since the domain
$\Delta$ is compact, the Lyapunov function reaches its maximal value inside
$\Delta$. The argmax of the Lyapunov function forms asymptotically stable
sets $A$ of equilibrium points. By Proposition 2, these sets are faces of
the domain (hence contain pure points). All points in $A$ are Nash
equilibrium points by using a similar argument as in Proposition 3. This
concludes the proof. □

4.2 A Stochastic Approximation of the Replicator Dynamics.

In this section, we present an algorithmic construction of the players'
strategies that selects a pure Nash equilibrium for the game with
repercussion utilities. A similar learning mechanism is proposed in [2]. We
now assume a discrete time, in which at each epoch $t$, players take a
random decision $S_n(t)$ according to their strategy $q_n(t)$, and update
their strategy profile according to their current payoff. We look at the
following algorithm ($\forall n \in \mathcal{N}, i \in \mathcal{I}_n$):

$$q_{n,i}(t+1) = q_{n,i}(t) + \epsilon\, r_n(\ell^{s_n}(s)) \left(1_{s_n = i} - q_{n,i}(t)\right), \quad (6)$$

where $s_n = S_n(t)$, $\epsilon > 0$ is the constant step size of the
algorithm, and $1_{s_n = i}$ is equal to 1 if $s_n = i$, and 0 otherwise.
Recall that we assume that $r_n(\ell^{s_n}(s)) \geq 0$. Then, if $\epsilon$
is small enough, $q_{n,i}$ remains in the interval $[0; 1]$. Strategies are
initialized with value $q(0) = q_0$. The step size is chosen to be constant
in order to have higher convergence speed than with decreasing step size.

One can notice that this algorithm is fully distributed, since for each
player $n$, the only information needed is $r_n(\ell^{s_n}(s))$.
Furthermore, at every iteration, each player only needs the utility of one
action (which is randomly chosen). In practice, this means that a player
does not have to scan all the actions before updating her strategy, which
would be costly.
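
The update rule (6) can be sketched in a few lines of Python for a single player. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names `sample_action` and `update_strategy` are hypothetical, and `reward` stands for the non-negative repercussion utility that the station would report for the action actually played.

```python
import random


def sample_action(q, rng=random):
    """Draw one action index according to the mixed strategy q."""
    return rng.choices(range(len(q)), weights=q, k=1)[0]


def update_strategy(q, chosen, reward, eps):
    """Apply one step of rule (6) to the strategy vector of one player.

    q      -- list of probabilities, one per available choice (sums to 1)
    chosen -- index of the action actually played at this epoch
    reward -- repercussion utility received for that action (assumed >= 0)
    eps    -- constant step size (assumed small enough: eps * reward <= 1)
    """
    return [
        qi + eps * reward * ((1.0 if i == chosen else 0.0) - qi)
        for i, qi in enumerate(q)
    ]
```

Only the reward of the played action is used, and the vector stays on the simplex: the chosen entry grows by `eps * reward * (1 - q_i)` while every other entry shrinks by `eps * reward * q_i`.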

Below, we provide some intuition on why the algorithm is characterized
by a differential equation, and how it asymptotically follows the replicator
dynamics (5). Note that we can rewrite (6) as:

$$q_{n,i}(t+1) = q_{n,i}(t) + \epsilon\, b(q_{n,i}(t), S_n(t)).$$

Then, we can split $b$ into its expected and martingale components:

$$\bar{b}(q_{n,i}(t)) = E[b(q_{n,i}(t), S_n(t))],$$
$$\nu(t) = b(q_{n,i}(t), S_n(t)) - \bar{b}(q_{n,i}(t)).$$

Again, (6) can be rewritten as:

$$\frac{q_{n,i}(t+1) - q_{n,i}(t)}{\epsilon} = \bar{b}(q_{n,i}(t)) + \nu(t).$$

As $\nu(t)$ is a random difference between the update and its expectation,
by application of a law of large numbers, for small $\epsilon$, this
difference goes to zero. Hence, the trajectory of $q_{n,i}(t)$ in discrete
time converges to the trajectory in continuous time of the differential
equation:

$$\frac{dq_{n,i}}{dt} = \bar{b}(q_{n,i}), \quad q(0) = q_0.$$

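Let us compute $\bar{b}(q_{n,i})$ (for ease of notation, we omit the
dependence on time $t$):

$$\bar{b}(q_{n,i}) = E[b(q_{n,i}, S_n)]$$
$$= q_{n,i}(1 - q_{n,i}) f_{n,i}(q) - \sum_{j \neq i} q_{n,j}\, q_{n,i}\, f_{n,j}(q)$$
$$= q_{n,i}\Big(f_{n,i}(q) - \sum_{j} q_{n,j} f_{n,j}(q)\Big)$$
$$= q_{n,i}\big(f_{n,i}(q) - \bar{f}_n(q)\big).$$

Then, $q_{n,i}(t)$ follows the replicator dynamics.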
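
The computation above can be checked numerically. The sketch below is an illustration only: the names `expected_update` and `replicator_term` are hypothetical, and `f[j]` is assumed to hold the expected reward of the player when she plays action `j` (the expectation being taken over the other players' choices). It enumerates the player's own random choice and compares the resulting mean update with the replicator term.

```python
def expected_update(q, f, i):
    """Mean of r * (1_{s = i} - q_i) over the player's own choice s.

    With probability q[s] the player picks s and receives, on average, f[s].
    """
    total = 0.0
    for s, qs in enumerate(q):
        indicator = 1.0 if s == i else 0.0
        total += qs * f[s] * (indicator - q[i])
    return total


def replicator_term(q, f, i):
    """Right-hand side q_i * (f_i - mean payoff) of the replicator dynamics."""
    mean_payoff = sum(qj * fj for qj, fj in zip(q, f))
    return q[i] * (f[i] - mean_payoff)
```

For instance, with `q = [0.5, 0.3, 0.2]` and `f = [1.0, 2.0, 4.0]`, both functions return -0.45 for `i = 0`, up to floating-point rounding.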

Consider a typical run of algorithm (6) over a system made of 10 users
with 5 choices over 10 networks. Figure 2 displays, for one user, the
probabilities of choosing each of the 5 possible choices. As the user has 5
possible choices, at time epoch 0, each choice has probability 0.2. Then, as
$t$ grows, all the probabilities except one tend to 0.




Figure 2: Convergence of the probability values for each of the 5 possible
choices of one user (horizontal axis: time epoch t).

4.3 Properties of the algorithm.

The algorithm is designed so as to follow the well-known replicator
dynamics. Furthermore, the stochastic aspect of the algorithm provides some
stability to the solution: whereas the deterministic solution of a
replicator dynamics may converge to a saddle point, this cannot happen with
the stochastic algorithm. The use of repercussion utilities provides a
potential to the companion game and it is known that the potential is a
Lyapunov function for the replicator dynamics, hence the potential is
increasing along the trajectories. The following theorem aggregates the main
results about the algorithm applied to repercussion utilities.

Theorem 2. The algorithm (6) weakly converges to a set of pure points
that are locally optimal for the potential function, and Nash equilibria of
the allocation game with repercussion utilities.

Proof. • The algorithm is a stochastic algorithm with constant step size.
From Theorem 8.5.1 of Kushner and Yin [17], we infer that the algorithm
weakly converges as $\epsilon \to 0$ to the limit points of the
trajectories of an ODE, which is, in our case, the replicator dynamics (5)
(this is a particular case in which the conditions of the theorem hold:
all variables are in a compact set and the dynamics is continuous).
Furthermore, the set to which the sequence $q(t)$ converges is an
asymptotically stable set of the replicator dynamics, because unstable
equilibria are avoided (the noise verifies the condition of [TH],
Theorem 1). From Proposition 2, the only asymptotically stable sets of the
dynamics are faces. Hence the algorithm converges to faces which are
asymptotically stable.

• We now show that the dynamics in such a face (denoted by $F$) converges
almost surely to a pure point. Let $q(0) \in F$. Then, the trajectory
$q(t)$ following the algorithm stays in $F$. Furthermore:

$$E[q_{n,i}(t+1) \mid q(t)] = q_{n,i}(t)\big(q_{n,i}(t) + \epsilon f_{n,i}(q(t))(1 - q_{n,i}(t))\big) + \sum_{j \neq i} q_{n,j}(t)\big(q_{n,i}(t) - \epsilon f_{n,j}(q(t))\, q_{n,i}(t)\big)$$
$$= q_{n,i}(t) + \epsilon\, q_{n,i}(t)\big(f_{n,i}(q(t)) - \bar{f}_n(q(t))\big).$$

Since at a mixed stationary point $f_{n,i}(q) = \bar{f}_n(q)$, then
$E[q_n(t+1) \mid q(t)] = q_n(t)$. Hence the process $(q_n(t))_t$ is a
martingale, and is almost surely convergent. The process converges
necessarily to a fixed point of the iteration
$q_{n,i}(t+1) = q_{n,i}(t) + \epsilon\, r_n(\ell^{s_n}(s))(1_{s_n = i} - q_{n,i}(t))$,
and the sole fixed points are pure points (since the step size $\epsilon$
is constant).

• Let $(q(t))_{t \in \mathbb{N}}$ be the random process given by the
algorithm. Suppose that it admits a closed set $A$ of limit points that
contains no pure points, such that $A \subset F$, where $F$ is the smallest
face of the domain $\Delta$ that contains $A$. Assume, for ease of
notation, that $F = \Delta \cap \{q : q_{n,i} = 0\}$. By Proposition 2, $F$
is a face of $\Delta$ that is a set of stationary points.

Denote by $A^\delta$ the $\delta$-neighborhood of $A$. We suppose that
$\delta$ is small enough to ensure that $A^\delta$ does not contain any pure
point (this is possible since $A$ is a closed set). Let $\mathcal{A}$ be the
set of $\omega$ such that
$\forall w \in A, \forall \delta > 0, \forall T \in \mathbb{N}, \exists t > T$
s.t. $q(t) \in A^\delta$. We now show that the Lebesgue measure of
$\mathcal{A}$, denoted $\mu(\mathcal{A})$, is null. Intuitively, as the
algorithm goes near the face, the probability that it follows a martingale
in the face is close to 1, and then the trajectory will not approach the
face.

Let $\tilde{\mathcal{A}}$ be the set of $\omega$ such that the martingale
(in $F$) $\tilde{q}(t)(\omega)$ converges to a pure point for every initial
condition in $F$. The measure of $\tilde{\mathcal{A}}$ is 1. Let
$s(\omega) = \inf\{T : \forall \tilde{q}(0) \in F, \forall t > T, \tilde{q}(t)(\omega) \notin A^\delta\}$.
$s(\omega)$ is the maximal time such that for every initial condition in
$F$, the martingale is outside $A^\delta$. Since $F$ is compact, it follows
that for all $\omega \in \tilde{\mathcal{A}}$, $s(\omega)$ is finite. Let
$\tilde{\mathcal{A}}(T^+) \subset \tilde{\mathcal{A}}$ (resp.
$\tilde{\mathcal{A}}(T^-)$) be the set of $\omega$ such that
$s(\omega) > T$ (resp. $s(\omega) \leq T$). Then,
$\mu(\tilde{\mathcal{A}}(T^+)) \to 0$ when $T \to \infty$.

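• Let $\delta = \frac{\nu}{k}$, where $\nu > 0$ and $k \in \mathbb{N}^*$.
If a trajectory $q(t)(\omega)$ is such that there exists $T$ with
$q_{n,i}(T)(\omega) < \frac{\nu}{k}$, then there exists a duration $T_k$
such that $\forall t \in [T - T_k; T]$, $q_{n,i}(t)(\omega) < \nu$. Let
$r = \max_n \max_s r_n(\ell^{s_n}(s))$. We now show that one can take

$$T_k \stackrel{\text{def}}{=} \min\left(T, \frac{-\ln(k)}{\ln(1 - \epsilon r)}\right).$$

Indeed, $q_{n,i}(t+1) \geq q_{n,i}(t) - \epsilon r\, q_{n,i}(t)$. Then
$q_{n,i}(T) \geq q_{n,i}(T - T_k)(1 - \epsilon r)^{T_k}$. It follows that
$q_{n,i}(T - T_k) \leq q_{n,i}(T)(1 - \epsilon r)^{-T_k} < \frac{\nu}{k}(1 - \epsilon r)^{-T_k} \leq \nu$.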

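• Let $p(\nu, T_k)$ be the probability that $q(t)(\omega)$, at distance
less than $\delta = \frac{\nu}{k}$ of $F$ at time $t_0$, does not follow the
martingale $\tilde{q}(t)(\omega)$ defined by
$\tilde{q}(t_0)(\omega) = \mathrm{proj}_F(q(t_0)(\omega))$, during time
$T_k$ (hence $q(t_0 + T_k)(\omega)$ can be inside $A^\delta$). Then:

$$\forall k \in \mathbb{N}, \quad \mu(\mathcal{A}) \leq \mu(\tilde{\mathcal{A}}(T_k^+)) + p(\nu, T_k)\, \mu(\tilde{\mathcal{A}}(T_k^-)).$$

Indeed, let $k \in \mathbb{N}$ and $\omega \in \mathcal{A}$. There is $T$
with $q_{n,i}(T)(\omega) \leq \frac{\nu}{k}$. For simplicity, suppose that
$T = T_k$. Then, either $\omega \in \tilde{\mathcal{A}}(T_k^+)$, or
$\omega \in \tilde{\mathcal{A}}(T_k^-)$, or $\omega$ is in the complementary
set of $\tilde{\mathcal{A}}$, whose measure is 0. If
$\omega \in \tilde{\mathcal{A}}(T_k^-)$, then $q(T)(\omega) \in A^\delta$
with probability $p(\nu, T_k)$.

We now show that, by taking an appropriate $\nu = \nu(k)$, then
$p(\nu(k), T_k) \to 0$ when $k \to \infty$. This, and the fact that
$\mu(\tilde{\mathcal{A}}(T_k^+)) \to 0$ when $k \to \infty$, implies that
$\mu(\mathcal{A}) = 0$.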

Suppose $\omega \in \tilde{\mathcal{A}}(T_k^-)$: let us define
$d_t = d(q(t), \tilde{q}(t))$ the distance between the interior trajectory
and the martingale trajectory at time $t$. Then, one can check that, under
$\omega$, $d_{t+1} \leq d_t(1 + \epsilon r)$, with probability at least
$1 - p(d_t)$ with $p(d_t) \stackrel{\text{def}}{=} d_t \sum_{n=1}^{N} I_n$.
Let $K \stackrel{\text{def}}{=} \sum_{n=1}^{N} I_n$. Indeed the vector of
actions $s(q)$ is the same as $s(\tilde{q})$ as long as $\omega$ picks the
same choice for all players. The contrary happens with probability

$$\sum_{n=1}^{N} \sum_{i=1}^{I_n} \Big| \sum_{k=1}^{i} \big(q_{n,k}(t) - \tilde{q}_{n,k}(t)\big) \Big|.$$

See Figure 3 for an illustration of this. Then, the lower bound follows
from the inequality
$\big| \sum_{k=1}^{i} (q_{n,k}(t) - \tilde{q}_{n,k}(t)) \big| \leq d_t$.



Figure 3: The thick line shows the measure of the set of all $\omega$
corresponding to the same choices for player 1 (with 3 choices).

Since $d_0 < \nu$, then $d_{T_k} > \nu(1 + \epsilon r)^{T_k}$ with
probability less than $p(\nu, T_k)$, where

$$p(\nu, T_k) = 1 - \prod_{t=0}^{T_k} (1 - p(d_t)) \leq 1 - \prod_{t=0}^{T_k} \big(1 - K \nu (1 + \epsilon r)^t\big).$$

Take $\nu = \frac{(1 + \epsilon r)^{-2 T_k}}{K}$. When $k \to \infty$,
$d_{T_k}$ goes to 0, and $p(\nu, T_k)$ goes to 0. Hence, $q(t)(\omega)$ does
not follow $\tilde{q}(t)(\omega)$ for $t = t_0$ to $t = t_0 + T_k$ with
probability $p(\nu, T_k)$, and then can be inside $A^\delta$.

• Finally, the fact that the pure point attained is a Nash equilibrium
follows from Proposition 3.

□



One can notice that the convergence of the algorithm to a pure point 
relies on the fact that the step size e is constant. If it were decreasing, the 
algorithm would converge to an equilibrium point in a stable face, that need 
not be pure. 

The combination of both algorithm (6) and repercussion utilities provides
an iterative method to select a pure allocation which is stable and locally
optimal. This can be viewed as a selection algorithm.

4.4 Global Maximum vs Local Maximum for the Selection Algorithm.

In the previous section, we showed that the algorithm converges to a local
maximum of the potential function. This implies that if there is only one
local maximum, the algorithm attains the global maximum. This arises for
instance if the potential function is concave. Without the uniqueness of the
local maximum, there is no guarantee of convergence to the global maximum.
Hence, assume there are multiple local maxima (that are pure points), which
is common when the payoffs are random. Each of them is an attractor for the
replicator dynamics. In this section, we investigate the following question:
does the initial point of the algorithm belong to the basin of attraction of
the global maximum?

Since every player has no preference at the beginning of the algorithm,
we assume that initially,
$\forall n \in \mathcal{N}, i \in \mathcal{I}_n, q_{n,i}(0) = \frac{1}{I_n}$.
In the following subsection we show that in the case of two players, both
having two choices, $q(0)$ is in the basin of attraction of the global
maximum. Then, in Subsections 4.4.2 and 4.4.3, we give counterexamples to
show that the result does not extend to the general case of more than two
players or more than two choices.



4.4.1 Case of two players and two choices 

Proposition 5. In a two players, two actions allocation game with reper- 
cussion utilities, the initial point of the algorithm is in the basin of attraction 
of the global maximum. 

Proof. Both players 1 and 2 can either take action $a$ or $b$. We denote by
$x$ the probability for player 1 to choose $a$, and by $y$ the probability
for player 2 to choose $a$. We denote by $(k_{i,j})_{i,j \in \{0,1\}}$ the
matrix such that $k_{i,j} = F(i,j)$, where $F(x,y)$ is the potential
function¹. Then, the dynamics (5) can be rewritten:

$$\begin{cases} \dfrac{dx}{dt} = x(1 - x)(k_{0,1} - k_{0,0} + K y) \\ \dfrac{dy}{dt} = y(1 - y)(k_{1,0} - k_{0,0} + K x), \end{cases} \quad (7)$$

where $K = k_{1,1} + k_{0,0} - k_{0,1} - k_{1,0}$. Note that in a
two-player two-action game, there are at most two local maxima. Suppose that
in the considered game, there are two local maxima. They are necessarily
attained either at points (0,0) and (1,1) or at points (0,1) and (1,0).
Without loss of generality, we can assume the former case. Hence,
$k_{0,0}$ and $k_{1,1}$ are local maxima, and $k_{1,1} > k_{0,0} + \gamma$,
where $\gamma > 0$.

¹ Actually, here, the derivative of the potential is equal to the
projection of the expected payoffs on the set $\Delta_n$.

We now define set $E$ and function $V$ as follows:

$$V(x,y) = |1 - x| + |1 - y|,$$
$$E = \{(x,y) : x + y > 1,\ 0 < x, y < 1\}.$$

($V$ is actually the distance of $(x,y)$ to the point (1,1) for the
1-norm.) We next show that $V$ is a Lyapunov function for the dynamics on
the open set $E$. To prove this, it is sufficient to show that

$$L(x,y) \stackrel{\text{def}}{=} \frac{\partial V}{\partial x} \frac{dx}{dt} + \frac{\partial V}{\partial y} \frac{dy}{dt} < 0 \quad \text{on } E.$$

First, note that $\forall (x,y) \in E$, $V(x,y) = 2 - x - y$. Hence, from
Eq. (7),

$$L(x,y) = -x(1 - x)(k_{0,1} - k_{0,0} + K y) - y(1 - y)(k_{1,0} - k_{0,0} + K x).$$

Let also $D$ be the open segment $\{(x,y) : x + y = 1,\ 0 < x, y < 1\}$.
Trivially,

$$\forall (x,y) \in D, \quad L(x, y = 1 - x) = -x(1 - x)(k_{1,1} - k_{0,0}) < 0. \quad (8)$$

Let us finally consider the segment

$$S(x_0) = \{(x,y) : x + y \geq 1,\ x = x_0,\ 0 \leq y \leq 1\}.$$

Figure 4 summarizes the different notations introduced.




Figure 4: Proof of Prop. 5. Summary of notations.

Since $E \subset \bigcup_{0 < x < 1} S(x)$, it is sufficient to show the
negativeness of $L$ on $S(x)$ for all $x$. Let us denote by $L_x(y)$ the
restriction of $L$ on $S(x)$. From Eq. (8) we have $L_x(1 - x) < 0$.
Furthermore, $L_x(y)$ is a quadratic function and its discriminant is
$4(k_{1,0} - k_{0,0})(k_{1,1} - k_{0,1})$; hence it is negative. So, for all
$x$, $L_x(y)$ is negative (strictly). Finally, $L$ is negative (strictly) in
$E$ and hence non-positive in a neighborhood of $E$.

Therefore, $V$ is a Lyapunov function for the dynamics on a neighborhood
of the open set $E$. More precisely, $V$ is strictly decreasing on the
trajectories of the dynamics starting in the set $E$, hence they converge to
the unique minimum of $V$ which is the point (1,1). This applies to the
initial point (0.5,0.5). □
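
The statement of Proposition 5 can be explored numerically with a forward-Euler integration of the two-player replicator dynamics. The sketch below is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, `k[i][j]` is assumed to hold the potential $F(i,j)$ at the pure points, and the payoff differences are evaluated directly from the multilinear potential, so the sketch does not depend on an index convention for the matrix.

```python
def potential_gradient(k, x, y):
    """Payoff differences F(1, y) - F(0, y) and F(x, 1) - F(x, 0).

    k[i][j] is assumed to be the potential at the pure point (i, j),
    where 1 means "plays action a" and 0 means "plays action b".
    """
    def F(u, v):
        return sum(
            k[i][j] * (u if i else 1 - u) * (v if j else 1 - v)
            for i in (0, 1) for j in (0, 1)
        )
    return F(1, y) - F(0, y), F(x, 1) - F(x, 0)


def simulate(k, x, y, dt=0.01, steps=5000):
    """Forward-Euler integration of the replicator dynamics from (x, y)."""
    for _ in range(steps):
        gx, gy = potential_gradient(k, x, y)
        x += dt * x * (1 - x) * gx
        y += dt * y * (1 - y) * gy
    return x, y


# Hypothetical potential with two local maxima: F(0,0) = 1 and F(1,1) = 2.
K_EXAMPLE = [[1.0, 0.0], [0.0, 2.0]]
```

With this hypothetical matrix, `simulate(K_EXAMPLE, 0.5, 0.5)` ends close to (1, 1), the global maximum, whereas a start below the diagonal such as (0.2, 0.2) ends close to (0, 0).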

Figure 5 illustrates this result: consider a two-player (numbered 1 and 2),
two-strategy (denoted by A and B) game. Let $x$ (resp. $y$) be the
probability for player 1 (resp. 2) to take action A. While two (local)
maxima exist, namely (1,1) and (0,0), the surface covered by the basin of
attraction of the global optimum (which is (1,1) in this example) is larger
than that of the other one. A by-product is that the dynamics starting in
point (0.5, 0.5) converges to the global optimum.




Figure 5: An example with 2 players with 2 choices each. There are 2
maxima. The point $(\frac{1}{2}, \frac{1}{2})$ is inside the attracting
basin of the global maximum.

Unfortunately, this appealing result cannot be generalized to more players
or more actions, as exemplified in the following subsections.

4.4.2 Extension to more than two players 

Example 2. Let us consider a three-player game
$(\mathcal{N}, \mathcal{I}, U)$ with $\mathcal{N} = \{1,2,3\}$,
$\mathcal{I} = \{A,B\}$, and
$U = (u_n(i,j,k))_{n \in \{1,2,3\},\ i,j,k \in \{A,B\}}$, where $i$ (resp.
$j$, $k$) denotes the choice of player 1 (resp. 2, 3). The matrix
representations of $(u_1,u_2,u_3)$ are given below:

$(u_1,u_2,u_3)(i,j,1)$:
    (9,6,4)   (5,5,5)
    (5,8,1)   (2,4,4)

$(u_1,u_2,u_3)(i,j,2)$:
    (7,2,8)   (5,4,7)
    (6,3,3)   (10,2,8)

Note that this game has no pure-strategy Nash equilibrium and a single
mixed-strategy Nash equilibrium, which is $(x,y,z) = (1/3, 5/6, 0)$. The
corresponding value of the potential function is 87/6 = 14.5.
The repercussion utility matrices are:

$(r_1,r_2,r_3)(i,j,1)$:
    (10,9,10)   (6,5,5)
    (5,5,6)     (1,1,4)

$(r_1,r_2,r_3)(i,j,2)$:
    (6,4,8)   (5,3,7)
    (1,3,4)   (9,11,14)

This game has two pure Nash equilibria, that are $(x,y,z) = (1,1,1)$ and
$(x,y,z) = (0,0,0)$, corresponding to values of the potential function that
are respectively 29 and 34.

Figure 6 shows that the trajectory starting at point
$(\frac{1}{2}, \frac{1}{2}, \frac{1}{2})$ converges to the local maximum
$(x,y,z) = (1,1,1)$ instead of the global maximum $(x,y,z) = (0,0,0)$. Note
that the performance of the local maximum is far better than that of the
Nash equilibrium in the original game.




Figure 6: Example with 3 players, with 2 choices each. The figure represents
the dynamic trajectory starting from the point
$(x,y,z)(0) = (\frac{1}{2}, \frac{1}{2}, \frac{1}{2})$, with $x$ (resp. $y$,
$z$) the probability for player 1 (resp. 2, 3) to adopt action A. The
dynamics converges to the point (1,1,1) whereas the global maximum is
(0,0,0).



4.4.3 Extension to more than two choices 

Example 3. Let us now consider the two-player game
$(\mathcal{N}, \mathcal{I}, U)$ with $\mathcal{N} = \{1,2\}$,
$\mathcal{I} = \{A,B,C\}$,
$U = (u_n(i,j))_{n \in \{1,2\},\ i \in \{A,B\},\ j \in \{A,B,C\}}$. (Note
that in this example, only the second player has three possible choices.)



RR n° 6653 



26 



P. Coucheney, C. Touati, B. Gaujal 



The payoff matrix is:

$(u_1,u_2)(i,j)$ =
    (6,3)   (-3,11)   (-3,10)
    (0,2)   (-1,1)    (0,10)

The companion game is:

$(r_1,r_2)(i,j)$ =
    (7,12)   (-3,11)   (-3,10)
    (0,2)    (-11,0)   (0,10)

The original game has a single pure Nash equilibrium, which is $(B, C)$,
resulting in the value 10 for the potential function, and no
mixed-strategy equilibrium exists.

The companion game has two pure Nash equilibria, $(A, A)$ and $(B, C)$,
corresponding to values of the potential function of 9 and 10 respectively.
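
The pure Nash equilibria stated for Example 3 can be checked by brute force. The sketch below is an illustration only: the helper `pure_nash_equilibria` is hypothetical, and it assumes that the first matrix listed in the example is the original payoff matrix and the second is the companion (repercussion) matrix, with rows indexed by player 1's action (A, B) and columns by player 2's action (A, B, C).

```python
from itertools import product

# Entries are (payoff of player 1, payoff of player 2).
ORIGINAL = [[(6, 3), (-3, 11), (-3, 10)],
            [(0, 2), (-1, 1), (0, 10)]]
COMPANION = [[(7, 12), (-3, 11), (-3, 10)],
             [(0, 2), (-11, 0), (0, 10)]]


def pure_nash_equilibria(game):
    """Return the action pairs from which no player gains by deviating alone."""
    rows, cols = len(game), len(game[0])
    equilibria = []
    for i, j in product(range(rows), range(cols)):
        # Player 1 compares her payoff across rows, column j being fixed.
        best_row = all(game[i][j][0] >= game[a][j][0] for a in range(rows))
        # Player 2 compares her payoff across columns, row i being fixed.
        best_col = all(game[i][j][1] >= game[i][b][1] for b in range(cols))
        if best_row and best_col:
            equilibria.append(("AB"[i], "ABC"[j]))
    return equilibria
```

Under these assumptions, `pure_nash_equilibria(ORIGINAL)` returns only `("B", "C")`, while `pure_nash_equilibria(COMPANION)` returns `("A", "A")` and `("B", "C")`, in agreement with the text.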

Denote by $x$ the probability for player 1 to choose action A and by $y_1$
(resp. $y_2$) the probability for player 2 to choose action A (resp. B).
Then, the global maximum of the potential function is 10, and is attained
when $x = y_1 = y_2 = 0$. Figure 7 shows that the trajectory starting at
point $(\frac{1}{2}, \frac{1}{3}, \frac{1}{3})$ converges to the local
maximum (1,1,0), corresponding to Nash equilibrium $(A, A)$ of the companion
game, which is inefficient. Interestingly, in this example, the unique Nash
equilibrium of the original game corresponds to the global maximum of the
game.




Figure 7: Example with 2 players. The first one has 2 choices and the
second one has 3 choices. Here we display the 3-dimensional plot of $y_1$ vs
$x$ and $y_2$ vs $x$. The dynamics starting in (1/2, 1/3, 1/3) converges to
the point (1,1,0) whereas the global maximum is (0,0,0).

5 Numerical study

This section is devoted to implementation issues and shows the numerical 
tests that were performed so as to study several possible practical heuristics 
based on the algorithm. 

First, notice that in the algorithm, users only need to know the reper- 
cussion utility on their current cell to compute their new strategy vector. 
Also, each base station only needs to know her own load to compute the 
repercussion utilities, hence allowing for a fully distributed algorithm. 

During the execution of the algorithm, at each time slot (typically, frames 
are sent every 40 ms for video transmission) , each user executes the algorithm 
independently, updates her probability vector, makes a choice according to 
her strategy and sends a packet to the corresponding base station. Mean- 
while, each base station measures the throughputs of all mobiles connected 
to it and computes the corresponding repercussion utilities. Then, it sends 
to every user their individual repercussion utility. 

Once a user reaches a pure strategy, she informs all the cells she has access 
to. Each cell waits for all users connected to her to converge before asking 
them to monitor their repercussion utility. From then on, any variation of 
the load is due to an arrival or departure in the cell. Hence, upon detection of 
a change of her repercussion utility, each user reruns the algorithm, starting 
with a new probability vector. 

In the previous theoretical sections, convergence of the algorithm has
been shown when the step size $\epsilon$ tends to 0. Here, we present
several simple heuristics with different step size computation methods.
While the step size should be small enough to ensure convergence, larger
values are preferable to decrease the algorithm runtime. Hence, appropriate
trade-offs need to be examined.

In the first subsection, we present the different heuristics (Subsection
5.1). We then present the scenario to be simulated (in terms of number of
users and network topology) (Subsection 5.2). To perform the tests,
realistic throughputs need to be chosen for different combinations of loads,
i.e. values of $u(\ell)$ for each possible load $\ell$. We provide such
values in Subsection 5.3. We then compare the results obtained by the
different heuristics, in terms of efficiency (the quality of the solution)
and convergence speed (Subsection 5.4). We briefly comment in Subsection 5.5
on the impact of fairness on the resulting association. Finally, in
Subsection 5.6, given the best heuristic, we provide experimental results
about: the scalability of the algorithm with the system size, the adaptation
to the arrival or departure of a mobile, the comparison with other policies,
and the adaptation to different kinds of traffic.

5.1 The Different Heuristics for the Steps 

Each heuristic actually consists of two parts:

A stopping test: As time increases, the probabilities of choosing each
action tend either to 0 or 1. So as to speed up convergence, we consider
thresholds $\delta_m$ and $\delta_M$ such that:

$$\forall n \in \mathcal{N}, \forall i \in \mathcal{I}_n, \quad q_{n,i}(t+1) = \begin{cases} 0 & \text{if } q_{n,i}(t) < \delta_m, \\ 1 & \text{if } q_{n,i}(t) > 1 - \delta_M. \end{cases}$$

When one of these operations is done, the strategies are normalized to
remain in the strategy set $\Delta$, and to preserve the condition
$\sum_i q_{n,i} = 1$. In the tests, we fix $\delta_m = 0.05$ and
$\delta_M = 0.3$.

A step size computation: different schemes to compute $\epsilon_n(t)$ are
considered.



5.1.1 Constant Step Size (CSS) 

In this heuristic, the step size is predefined and constant throughout time:
$\forall n \in \mathcal{N}, \forall t, \epsilon_n(t) = \epsilon$. For low
values (CSSl), typically $\epsilon = 0.01$, the algorithm converges in
almost all cases to the optimal solution, but at the cost of a high number
of iterations. For high values (CSSh), typically $\epsilon = 1$, the
convergence and the optimality are not guaranteed anymore. Intermediate
values (CSSm), typically $\epsilon = 0.1$, are possible compromises.



5.1.2 Constant Update Size (CUS) 

At each time epoch, each user computes the maximum step size so that the 
change of probabilities for all choices, is bounded by a predefined value F 
(fixed to 0.1 in the experiments): 

$$\forall n \in \mathcal{N}, \forall i \in \mathcal{I}_n, \quad |q_{n,i}(t+1) - q_{n,i}(t)| \leq F.$$

By bounding the update of every user, this scheme yields smooth changes in 
the strategy vectors and hence can be expected to follow the behavior of the 
differential equations. 
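
One way to compute such a step size is sketched below. This is an illustration under stated assumptions rather than the authors' implementation: the function name `cus_step_size` is hypothetical, the update is assumed to have the form of rule (6), and the additional cap `eps * reward <= 1`, which keeps the probabilities inside [0, 1], is an assumption that does not come from the text.

```python
def cus_step_size(q, chosen, reward, bound=0.1):
    """Largest step size keeping every probability change within `bound`.

    Under rule (6), entry i changes by eps * reward * (1_{chosen = i} - q[i]),
    so the largest change is eps * reward * max_i |1_{chosen = i} - q[i]|.
    """
    largest = max(
        abs((1.0 if i == chosen else 0.0) - qi) for i, qi in enumerate(q)
    )
    if reward <= 0.0 or largest == 0.0:
        # The update is null whatever the step; return 0.0 by convention.
        return 0.0
    # Assumed cap: eps * reward <= 1 keeps all probabilities in [0, 1].
    return min(bound / largest, 1.0) / reward
```

For example, with four equiprobable choices, a reward of 2 and the default bound of 0.1, the returned step makes the chosen action's probability move by exactly 0.1.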



5.1.3 Decreasing Step Size (DSS) 

The underlying idea of this scheme is to use a few iterations with large
steps before using some smaller step sizes. Indeed, a big step size lets
actions associated with large repercussion utilities quickly get high
probabilities of occurrence. Since the algorithm converges to a Nash
equilibrium regardless of the initial conditions, using a few large steps
amounts to changing the initial conditions so as to get close to extremal
points, and hence to possible pure-strategy Nash equilibria. Then, the
following iterations with smaller step sizes correspond to a good
approximation of the CSSl algorithm. These steps confirm (or refute) the
fact that the extremal point closest to the one obtained after the first
iterations is a Nash equilibrium.

We consider two variants of the decreasing step size mechanism. The
first one is a cyclic decreasing step size (DSSSA) (in the experiments,
$\epsilon = 3/(t \bmod 10)$). During each cycle a Nash equilibrium candidate
is tested. This is inspired by simulated annealing approaches.

The second variant (DSSCSS) is a decreasing step size phase followed by
a constant large step size (in the experiments, $\epsilon = 4/t$ if
$t < 120$ and $\epsilon = 4$ otherwise). The underlying idea is that the
first phase would stabilize a certain number of users. Then, a large step
size should improve the convergence speed of the others to their respective
preferable choices.

5.2 System Scenario 

We consider a simple scenario of an operator providing subscribers with a 
service available either through a large WiMAX cell or a series of WiFi hot 
spots. 

For each simulation, a topology is chosen randomly, according to 3
parameters (the number of users, the number of WiFi hot spots and the number
$I_n$ of possible choices for each user). More precisely, for each user:

• The first choice is the WiMAX cell and one of the 8 possible zones (as
defined in Section 5.3), picked at random (uniformly).

• All other $I_n - 1$ choices are one of the WiFi cells, picked according
to a uniform law. As explained in Section 5.3, we consider that all
mobiles in a common WiFi cell receive the same throughput.

The strategy vector is initialized with equal probabilities:
$\forall n \in \mathcal{N}, \forall i \in \mathcal{I}_n, q_{n,i}(0) = \frac{1}{I_n}$.

5.3 Throughput of TCP sessions in WLAN and WiMAX 

Computing the throughput experienced by a packet in a wireless environment
is extremely hard due to the complexity of the physical system (as opposed
to wired systems, where the physical medium is separated from the outside
world, and hence has reliable properties, the wireless link quality changes
at every instant, due to the environment: air quality, buildings and
physical obstacles, etc.). Therefore, the closed-form formulas available in
the literature were obtained using strong assumptions on the outside world
and do not refer to the throughput of a single packet but to means of flows.
Indeed, as the number of packets in any connection is large, the flow is
usually approximated as a fluid.

In addition, the useful throughput of a connection, also called goodput,
depends on the network protocol. Roughly speaking, two main elements
have a strong impact on the achieved goodput: the first is the physical
system, which depends on the technology in terms both of maximum capacity
and multiplexing technology; the second is the transport protocol. In this
simulation study, we consider the case of TCP flows, for which good
throughput approximations are available in the literature. Yet, the use of
UDP flows, or a mixture of TCP and UDP flows, does not impact the
performance of the algorithm. (Note that allowing users to use either the
TCP or the UDP protocol for their transmission amounts, in the algorithm, to
considering an additional zone in the network cell.)

Figure 8: Capacity of a WiFi cell as a function of its load (in bit/s). The
maximum is reached with 3 users.




Equations of throughput in WiFi cells. Based on [19], we consider that
the throughput of connection $i$ is

$$u_n(\ell^i(s)) = \frac{L_{TCP}}{\ell^i \left(T_{data} + T_{ack} + 2 T_{tbo}(\ell^i) + 2 T_W(\ell^i)\right)}$$

where $\ell^i = \sum_{n \in \mathcal{N}} 1_{s_n = i}$ is the number of
mobiles connected to network $i$, and $L_{TCP}$ = 8000 bits is the size of a
TCP packet. $T_{ack}$ is the raw transmission time of a TCP ACK
(approximately 1.091 ms), and $T_{data}$ the raw transmission time of a TCP
data packet (about 1.785 ms). Then, $T_W$ and $T_{tbo}$ are the mean total
times lost due to collisions and back-offs respectively. These depend on the
collision probability of each packet, and hence on the load of the network.
This collision probability can be numerically obtained via a fixed point
equation given in [TH]. Figure 8 displays the throughput of a WiFi cell as a
function of the load.

Figure 9: Average performance of the heuristics (CUS, DSSSA, DSSCSS, CSSl,
CSSm and CSSh resp.) with different loads (with 5% confidence intervals).



WiMAX As opposed to WiFi, the WiMAX technology uses OFDMA mul- 
tiplexing. Hence, each user receives a certain number of carriers which are 
converted into a certain amount of throughput depending on the chosen 
modulation and coding scheme, which greatly depends on the link quality at 
the receiver side. We consider a fair sharing in terms of carriers [20], i.e. if p 
users are present in the WiMAX cell, each of them will receive NbSCarriers/p 
sub-carriers, similarly to processor sharing. Hence, the goodput experienced 
by a user in zone z (corresponding to a coding scheme) is roughly the fraction 
1/p of the throughput she would obtain if she were alone in the cell. 

For a single user within the WiMAX cell, we follow experimental values
obtained in [21] for IEEE WiMAX 802.16d for its eight zones:

Modulation            | QAM64 3/4 | QAM64 2/3 | QAM16 3/4 | QAM16 1/2
TCP goodput (Mbit/s)  | 9.58      | 8.88      | 6.80      | 4.50

Modulation            | QPSK 3/4  | QPSK 1/2  | BPSK 3/4  | BPSK 1/2
TCP goodput (Mbit/s)  | 3.37      | 2.21      | 1.65      | 1.08
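
The fair sharing rule for the WiMAX cell can be sketched as follows. This is a minimal illustration, not the simulator used in the paper: the names `SINGLE_USER_GOODPUT` and `wimax_goodput` are hypothetical, and the values are assumed to be in Mbit/s, the unit used for the cell capacity in the comparison of heuristics.

```python
# Single-user TCP goodput per zone, copied from the table (assumed Mbit/s).
SINGLE_USER_GOODPUT = {
    "QAM64 3/4": 9.58,
    "QAM64 2/3": 8.88,
    "QAM16 3/4": 6.80,
    "QAM16 1/2": 4.50,
    "QPSK 3/4": 3.37,
    "QPSK 1/2": 2.21,
    "BPSK 3/4": 1.65,
    "BPSK 1/2": 1.08,
}


def wimax_goodput(zone, load):
    """Goodput of one user in `zone` when `load` users share the cell.

    Sub-carriers are split evenly, so each user gets a fraction 1/load of
    the goodput she would obtain alone.
    """
    if load < 1:
        raise ValueError("load must count at least the user herself")
    return SINGLE_USER_GOODPUT[zone] / load
```

For example, a user in the QAM64 2/3 zone sharing the cell with three other users would obtain 8.88 / 4 = 2.22 Mbit/s under this rule.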



5.4 Comparisons between Heuristics 

Figure 9 displays the performance (in terms of global throughput) obtained
by the six heuristics (CUS, DSSSA, DSSCSS, CSSl, CSSm and CSSh resp.) as a
function of the total number of users $N$. For a given load, all
heuristics have been tested on the same topology to allow a fair comparison.

The small constant step size (CSSl with $\epsilon = 0.01$) provides the
best performance. It was even found to be optimal in the tests for small
values of $N$, up to 20.

Figure 10: Average number of iterations before convergence of heuristics
(CUS, DSSSA, DSSCSS, CSSl, CSSm and CSSh resp.) for different loads (with
5% confidence intervals).



Most heuristics stay within 10% of the optimal (except for DSSCSS,
whose performance can be poor). Also note that the total capacity of the
system is less than 36 Mbit/s (10 × 2.6 (WiFi) + 9.58 (WiMAX)). Thus
the best heuristic is always within 5% of the optimal. Finally, it should be
noted that the medium constant step size (CSSm) with $\epsilon = 0.1$ is
always very close to the best (CSSl) and that the constant update size (CUS)
performs better and better when the number of users grows.

As for the number of iterations, it varies widely between the different
heuristics, even on a logarithmic scale (see Figure 10). The CUS heuristic
is a clear winner here (with an average number of iterations never above
80). Meanwhile, CSSl does not always converge within the limit of 20,000
iterations set in the program.

Under high loads, CUS provides the best compromise with very fast
convergence and reasonable performance. Under light load, the constant step
size of medium size (CSSm) is also an interesting choice, for its
performance is almost optimal and its number of iterations remains below
100.



5.5 Impact on Fairness 

Consider the following scenario: a set of 20 users, each having 3 available
choices among 10 cells. The WiMAX cell is numbered 0 and its 8 zones are
numbered from 0 to 7. The sets of choices of the users are $\mathcal{I}$ =

{{0,1},{8},{1}} {{0,5}, {6}, {4}} {{0,1}, {6}, {9}} 

{{0,2}, {2}, {6}} {{0,3}, {8}, {9}} {{0, 6}, {4}, {9}} 

{{0,7}, {3}, {6}} {{0,4},{1},{2}} {{0,6}, {6}, {9}} 

{{0,5}, {3}, {4}} {{0,6},{3},{1}} {{0,7}, {9}, {6}} 

{{0,3},{8},{1}} {{0,6}, {4}, {7}} {{0,6}, {9}, {5}} 

{{0,0}, {6}, {5}} {{0,5},{4},{1}} {{0,6}, {6}, {4}} 

{{0,3}, {3}, {4}} {{0,3}, {8}, {4}}. 

The optimal association schemes for α = 0 (efficient scheme) and α = 2
(fair scheme) are, respectively:

A_eff  = {2, 1, 2, 1, 1, 1, 1, 2, 2, 2, 1, 1, 2, 2, 2, 0, 2, 1, 1, 1},
A_fair = {0, 1, 0, 1, 0, 2, 1, 2, 1, 1, 2, 1, 1, 2, 2, 2, 1, 2, 0, 1},

resulting in throughputs (in Mb/s) of:

r_eff  = 0.824, 1.225, 0.824, 1.225, 1.225, 1.225, 0.824, 1.225, 0.824, 1.225,
         0.824, 0.824, 0.824, 2.245, 2.246, 9.58, 0.824, 1.225, 0.824, 1.225;

r_fair = 2.22, 1.225, 2.22, 1.225, 1.125, 1.225, 1.225, 1.225, 1.225, 1.225,
         2.245, 1.225, 1.225, 2.246, 1.225, 1.225, 1.225, 1.225, 1.125, 1.225.

The efficient scheme achieves a total throughput of 31.29 Mb/s. The
fair scheme suffers a degradation of slightly less than 10%, with a total
throughput of 28.34 Mb/s. Yet a closer look at the figures indicates that the
efficient scheme leads to large differences between users (user 1 only obtains
a throughput of 0.8 Mb/s while user 16 is granted 9.58 Mb/s). With the
fair association scheme, on the other hand, all users benefit from
throughputs higher than 1.1 Mb/s. As in bandwidth allocation mechanisms in
wired systemsfH], the parameter α hence allows one to finely tune the
compromise between maximum global throughput and fairness between users.
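
As a quick check of these figures, the following Python sketch recomputes
the totals from the two throughput vectors listed above and compares the
schemes under an α-fair welfare function. The function alpha_fair_welfare
uses the standard α-fair form (sum of x^(1−α)/(1−α), with the logarithm at
α = 1); that this is the objective tuned by the parameter is an assumption
here, and all helper names are illustrative.

```python
import math

# Per-user throughputs (Mb/s) copied from the two vectors listed above.
R_EFF = [0.824, 1.225, 0.824, 1.225, 1.225, 1.225, 0.824, 1.225, 0.824, 1.225,
         0.824, 0.824, 0.824, 2.245, 2.246, 9.58, 0.824, 1.225, 0.824, 1.225]
R_FAIR = [2.22, 1.225, 2.22, 1.225, 1.125, 1.225, 1.225, 1.225, 1.225, 1.225,
          2.245, 1.225, 1.225, 2.246, 1.225, 1.225, 1.225, 1.225, 1.125, 1.225]


def alpha_fair_welfare(rates, alpha):
    """Standard alpha-fair welfare (assumed form, not taken from the text)."""
    if alpha == 1:
        return sum(math.log(x) for x in rates)
    return sum(x ** (1 - alpha) / (1 - alpha) for x in rates)


def summarize(rates):
    """Return (total, minimum, maximum) throughput of a scheme."""
    return sum(rates), min(rates), max(rates)


if __name__ == "__main__":
    for name, rates in (("efficient", R_EFF), ("fair", R_FAIR)):
        total, lowest, highest = summarize(rates)
        print(f"{name}: total={total:.2f} min={lowest:.3f} max={highest:.3f}")
    loss = 1 - sum(R_FAIR) / sum(R_EFF)
    print(f"throughput loss of the fair scheme: {100 * loss:.1f}%")
```

With α = 0 the welfare is simply the total throughput, so the efficient
scheme scores higher; with α = 2 the ordering is reversed, because the
many users at 0.824 Mb/s are penalized more heavily than the single user
at 9.58 Mb/s is rewarded.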

To understand these differences, let us compare the loads between the 
associations: 

ℓ^eff = {3, 2, 3, 2, 1, 2, 1, 2, 3},   ℓ^fair = {1, 2, 2, 2, 2, 2, 1, 2, 2}.

From Fig. [HJ one can see that the maximum capacity for the WiFi cells is
obtained for a load of 3 users. Hence, the efficient scheme tries to obtain 
as many cells with load 3 as possible. Meanwhile, the WiMAX capacity is 
maximal when its users all belong to zone 0. Hence, such users are automat- 
ically associated to this cell (in our case there is only one such user, which 
obtains a throughput of 9.58 Mb/s). 

On the other hand, the fair scheme tries to find balanced association
schemes. Hence, the loads of the different WiFi cells are close to one another
(here ranging between 1 and 2; note that they cannot be strictly equal due
to the discrete nature of the problem) and the WiMAX cell is associated to
some users belonging to efficient zones. Their number is chosen so as to
obtain similar performance as for users remaining in the WiFi cells.

Hence, while purely efficient schemes produce lightly loaded WiMAX 
cells (with only the users in zone 0), the fair scheme leads to more balanced 
loads (here, 4 users in the WiMAX cell and about 2 users in each WiFi cell). 
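
The loads discussed in this section can be recomputed from the choice sets
and the two association vectors. The sketch below assumes that the choice
sets are listed row by row (user 1 first), and that an entry 0, 1 or 2 of
an association vector selects the WiMAX cell, the first WiFi choice or the
second WiFi choice of the user, respectively. Both conventions are
assumptions; the names are illustrative.

```python
from collections import Counter

WIMAX = 0  # index of the WiMAX cell; WiFi cells are numbered 1..9

# (WiMAX zone, first WiFi cell, second WiFi cell) for users 1..20,
# read row by row from the list of choice sets (assumed reading order).
CHOICES = [
    (1, 8, 1), (5, 6, 4), (1, 6, 9),
    (2, 2, 6), (3, 8, 9), (6, 4, 9),
    (7, 3, 6), (4, 1, 2), (6, 6, 9),
    (5, 3, 4), (6, 3, 1), (7, 9, 6),
    (3, 8, 1), (6, 4, 7), (6, 9, 5),
    (0, 6, 5), (5, 4, 1), (6, 6, 4),
    (3, 3, 4), (3, 8, 4),
]
A_EFF = [2, 1, 2, 1, 1, 1, 1, 2, 2, 2, 1, 1, 2, 2, 2, 0, 2, 1, 1, 1]
A_FAIR = [0, 1, 0, 1, 0, 2, 1, 2, 1, 1, 2, 1, 1, 2, 2, 2, 1, 2, 0, 1]


def cell_of(choice, pick):
    """Cell selected by a user: WiMAX if pick is 0, else the WiFi cell."""
    return WIMAX if pick == 0 else choice[pick]


def loads(assoc):
    """Return (number of WiMAX users, list of loads of WiFi cells 1..9)."""
    counts = Counter(cell_of(c, a) for c, a in zip(CHOICES, assoc))
    return counts[WIMAX], [counts[k] for k in range(1, 10)]


def wimax_zones(assoc):
    """Sorted zones of the users associated to the WiMAX cell."""
    return sorted(c[0] for c, a in zip(CHOICES, assoc) if a == 0)


if __name__ == "__main__":
    print("efficient:", loads(A_EFF), "zones", wimax_zones(A_EFF))
    print("fair:     ", loads(A_FAIR), "zones", wimax_zones(A_FAIR))
```

Under these conventions the script reproduces the two load vectors given
above, with a single WiMAX user (in zone 0) for the efficient scheme and
four WiMAX users for the fair scheme.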

5.6 Further simulations 

While very small constant step sizes provided limit points with near optimal 
performance, all heuristics but CUS needed several thousand steps before 
convergence for scenarios with more than 10 users and/or cells. The number 
of steps for CUS never topped 100 and its limit points also proved very good 
(within a few percent of the optimal). All simulations reported in this subsection
use the CUS heuristic. 

5.6.1 Scalability 

Here, we investigate the impact of the number of mobiles and the number of 
cells each mobile can connect to on the speed of convergence (Figures 11 and 12).
Unlike in the previous section where the criterion of convergence speed was 
the number of iterations of the algorithm, here, we measure the average 
number of handovers for a mobile before convergence. It can be argued 
that this new measure of convergence is more relevant since handovers are 
costly for mobiles. Figures 11 and 12 show that the mean number of handovers
is smaller than 20 when mobiles have 2 choices, and smaller than 25 when 
mobiles have 3 choices, even for large numbers of mobiles. 
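
To make this convergence measure concrete, here is a minimal Python sketch
that counts handovers from a per-iteration trace of the cell used by each
mobile. The trace format (one cell index per iteration) and the function
names are assumptions made for illustration; they are not taken from the
simulator used in this report.

```python
def count_handovers(trace):
    """Number of times two consecutive entries of a cell trace differ."""
    return sum(1 for prev, cur in zip(trace, trace[1:]) if prev != cur)


def mean_handovers(traces):
    """Average number of handovers per mobile over a list of traces."""
    if not traces:
        raise ValueError("at least one trace is required")
    return sum(count_handovers(t) for t in traces) / len(traces)


if __name__ == "__main__":
    # Hypothetical traces: each list gives the cell used at each iteration.
    traces = [
        [3, 3, 0, 0, 3, 3, 3],  # two handovers, then stable on cell 3
        [5, 5, 5, 5, 5, 5, 5],  # no handover
        [0, 2, 0, 2, 2, 2, 2],  # three handovers
    ]
    print(mean_handovers(traces))
```

A mobile that keeps sending packets to the same cell for many iterations
contributes nothing to this count, which is why the measure can stay small
even when the number of iterations is large.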

5.6.2 Adaptation to Arrivals and Departures 

The association algorithm has to be run at every arrival or departure of a 
user in a cell. Here, we simulate the occurrence of such events. Typical time 
scales compare nicely: while arrivals or departures of users in WiMAX or 
WiFi cells occur every minute or so, the association algorithm converges in 
less than a second in most cases. 

In Figures 13 and 14, the arrivals follow a Poisson process. Each incoming
mobile has a message of exponentially distributed random size to download. One unit
of time corresponds to the duration of an iteration of the algorithm. In 
the second figure, white noise may model perturbations on the cell capacity 
(fading) as well as errors on the measures of the real throughput. 
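
The arrival model described above can be sketched as follows. This is only
an illustration of the workload (Poisson arrivals, exponentially
distributed message sizes, time counted in iterations of the algorithm);
the function name and the numerical parameters are hypothetical and are
not the values used in the simulations.

```python
import random


def generate_arrivals(rate, mean_size, horizon, seed=0):
    """Return a list of (arrival_time, message_size) pairs.

    rate      -- mean number of arrivals per iteration (Poisson process)
    mean_size -- mean of the exponentially distributed message size
    horizon   -- arrivals after this time are discarded
    """
    rng = random.Random(seed)
    events = []
    t = 0.0
    while True:
        # Exponential inter-arrival times yield a Poisson arrival process.
        t += rng.expovariate(rate)
        if t > horizon:
            return events
        events.append((t, rng.expovariate(1.0 / mean_size)))


if __name__ == "__main__":
    # Hypothetical parameters: one arrival every 20 iterations on average.
    arrivals = generate_arrivals(rate=0.05, mean_size=10.0, horizon=10000)
    print(len(arrivals), "arrivals; first three:", arrivals[:3])
```

A mobile leaves once its message has been downloaded, so its departure
time depends on the throughput it obtains, and therefore on the
association scheme in use.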

5.6.3 Comparison with Naive or Sub-optimal Methods 

In this section, we compare our algorithm to naive allocation methods for
incoming mobiles.

Figure 11: Mean number of handovers for a mobile when she has 2 choices,
as a function of the total number of mobiles (the lines show the average
together with the upper and lower bounds of the 5% confidence interval).

Figure 12: Mean number of handovers for a mobile when she has 3 choices,
as a function of the total number of mobiles.

Figure 13: Adaptation to arrivals and departures: the heuristic smoothly
and quickly reconverges after a state change.

Figure 14: Stability with respect to measurement errors: behavior of the
algorithm when the throughput of all cells is subject to white Gaussian
noise with variance 0.45.



Figure 15: Percentage of efficiency gain obtained by using our algorithm in
comparison to the fixed choice of a WiFi cell for each incoming mobile
(x-axis: number of mobiles; y-axis: percentage of efficiency gain). The
number of mobiles is variable, but the number of WiFi cells is fixed to 15.

Fixed Allocation to a WiFi Cell. The first naive method for a mobile
consists in always connecting to a WiFi cell whenever possible. It is inspired
by the only currently deployed technology implementing vertical handovers,
called GAN (Generic Access Network), also known as Unlicensed Mobile
Access (UMA). Actually, GAN only enables switching between WLAN and
GSM/UMTS. The capacity of WLAN networks is so much larger than that
of GSM/UMTS networks that switching to a WLAN network whenever
possible is almost always a good choice. That is why the network selection of
GAN is very basic: the handset gives absolute preference to 802.11 networks
over GSM. However, the GAN selection scheme is unlikely to be efficient
in more complex settings, especially when the load of WiFi cells becomes
very large and when WiFi cells compete against WiMAX or LTE cells, whose
performance is closer to that of WiFi than that of UMTS is. Figure 15 shows
the relative improvement of our algorithm compared to a GAN-like approach.



Allocation to the Best Cell. In this second naive method, an incoming
mobile acts selfishly: she probes all available cells, connects to the one
that offers the best throughput at connection time, and never changes
afterwards. Figure 16 shows the difference in global throughput between
the two association methods. We see that our algorithm achieves a
significantly better throughput than the selfish method. This is yet
another illustration of the fact that selfish behaviors lead to a poor use
of the resources.

Figure 16: Evolution of the global throughput using the association
algorithm ("algo") and greedy probing ("selfish"). At time 0, the
configuration is the same, and the arrival processes of users are identical
in the two cases. Since the throughputs for mobiles are different in the two
schemes, the departure times are different. The mean performance over this
period of time is 40.1 for "algo" and 29.9 for "selfish".
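
The two naive rules described in this section are simple enough to be
written in a few lines. The Python sketch below is only an illustration:
the data layout (pairs of cell identifier and technology, and a dictionary
of probed throughputs), the function names and the numerical values in the
example are hypothetical.

```python
def gan_like_choice(available):
    """Fixed allocation: take the first WiFi cell if there is one.

    available -- list of (cell_id, technology) pairs, where technology is
                 "wifi" or "wimax" (hypothetical encoding).
    """
    if not available:
        raise ValueError("no cell available")
    for cell_id, technology in available:
        if technology == "wifi":
            return cell_id
    return available[0][0]  # no WiFi cell in range: fall back to the first cell


def selfish_choice(probed):
    """Best-cell allocation: pick the cell with the highest probed throughput.

    probed -- dict mapping cell_id to the throughput offered at connection
              time; the choice is never revised afterwards.
    """
    if not probed:
        raise ValueError("no cell probed")
    return max(probed, key=probed.get)


if __name__ == "__main__":
    available = [(0, "wimax"), (4, "wifi"), (7, "wifi")]
    probed = {0: 2.22, 4: 0.824, 7: 1.225}  # hypothetical values, in Mb/s
    print(gan_like_choice(available), selfish_choice(probed))
```

Neither rule takes into account the effect of the newcomer on the users
already present in the chosen cell, which is precisely what the
repercussion utility is meant to capture.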

Comparison with the Throughput as Payoff. Finally, we compare
our algorithm when the repercussion utility is used as the payoff for mobiles
(Section 3.2), which ensures convergence to a locally optimal point, to
the same algorithm when the payoff of each user n is defined to be its
throughput u_n. See Figures 17, 18 and 19. Here the gain is much lower,
but both algorithms have roughly the same convergence time.

5.6.4 Real-Time Traffic vs Elastic Traffic 

The question here is to know whether real time traffic can be taken into 
account in the algorithm. In fact, for elastic traffic, utility for users is inti- 
mately related to the throughput they receive. For real time traffic like voice 
or video transmission, users require a certain level of throughput. Hence the 
idea is to build a different utility function for these users. 

The first idea is to have a null utility if the throughput is under a certain
threshold, and a utility equal to 1 otherwise. The algorithm works well with
this utility but is slow to converge because the discontinuity causes a
bang-bang behavior of the users. This problem can be avoided by transforming
the utility function: under the threshold the utility is still 0, and it
becomes 1 − exp(−u_n(ℓ^n)) above it. This provides good solutions in terms
of convergence speed as well as a good overall utility. In Figure 20, we show
how the time of convergence of this heuristic behaves when the ratio of
real-time traffic varies. The impact of this ratio on the time of
convergence is not significant.

Figure 17: Percentage of efficiency gain when using repercussion utilities
instead of throughputs, when the number of mobiles varies. The ratio of the
number of WiFi cells to the number of mobiles is constant and equal to 1/5.

Figure 18: Similar to Figure 17, but the number of WiFi cells varies and
the number of mobiles is constant and equal to 30.

Figure 19: Like Figure 18, but with a constant number of mobiles equal to
20.

Figure 20: Dependency of the time of convergence on the ratio of elastic
traffic. The number of mobiles is 30.
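
The two utility shapes described above for real-time traffic can be
sketched in Python as follows. The function names and the threshold value
in the example are assumptions made for illustration; the throughput
argument plays the role of the throughput received by the user.

```python
import math


def step_utility(throughput, threshold):
    """First idea: 0 under the threshold, 1 otherwise."""
    return 1.0 if throughput >= threshold else 0.0


def smoothed_utility(throughput, threshold):
    """Transformed utility: 0 under the threshold, 1 - exp(-throughput) above."""
    if throughput < threshold:
        return 0.0
    return 1.0 - math.exp(-throughput)


if __name__ == "__main__":
    threshold = 1.0  # hypothetical requirement, in Mb/s
    for u in (0.5, 1.0, 2.0, 4.0):
        print(u, step_utility(u, threshold), round(smoothed_utility(u, threshold), 3))
```

In this sketch the transformed utility still jumps at the threshold (from
0 to 1 − exp(−threshold)), but it keeps increasing above it, so a user
whose requirement is already met can still distinguish between cells.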



5.6.5 A Dynamic Scenario: between Mice and Elephants 

Here, we consider that the global traffic is made up of two kinds of traffic,
called mice and elephants. The mouse traffic corresponds to short-lived
connections (< 1 second) and the elephant traffic to long connections (up to
one minute). There are relatively few elephants and a large number of mice
(90%), but globally, elephant traffic represents approximately 85% of the
global traffic. Our algorithm is well adapted to elephant traffic,
since the time of convergence is negligible with respect to the duration of
the connection, but this is not the case for mice traffic. In Figures 21 and
22, we compare two scenarios: one where both mice and elephants use the
algorithm, and one where only elephants do so (while mice always connect to
one WiFi cell). The second method reduces the number of handovers and
preserves the overall throughput (even giving a small gain), as seen in
Figures 21 and 22.

Figure 21: Traffic made of 30 initial users with 90% mice. The average packet
size for elephants is 20 times the average packet size for mice. The figure
shows the total throughput when all users apply the algorithm. The average
total throughput is 39.05 Mb/s.

Finally, Figure 23 shows the performance gain obtained when the algorithm
is applied to both mice and elephants rather than to elephants only. It
shows that both methods have a similar efficiency, but the second ensures
a low rate of handovers. Interestingly, this is independent of the ratio of
mice traffic. This means that the loss of throughput due to the algorithm
(which is significant when the percentage of mice is high) is balanced by
the loss of optimality of the second method.



6 Conclusion and Future Works 

In this paper, we have designed a distributed algorithm that selects an
efficient (in terms of fairness or global throughput) network association in
heterogeneous wireless networks. Simulations show that this method is
relevant when compared with naive methods. This opens the way to several
interesting future works, such as the implementation of such methods in
modern mobile devices in collaboration with Alcatel-Lucent.

Figure 22: Same configuration and arrival process as in Figure 21. In
this figure, mice are directly allocated to the WiFi cell without applying the
algorithm. The mean throughput is 39.19 Mb/s.

Figure 23: Percentage of gain obtained by running the algorithm for mice and
elephants instead of running it only for elephants, as a function of the
percentage of mice traffic (the global traffic average remains constant).